Abstract

The scenario-based optimization approach ('scenario approach') provides an intuitive way of approx- 
imating the solution to chance-constrained optimization programs, based on finding the optimal solution 
under a finite number of sampled outcomes of the uncertainty ('scenarios'). A key merit of this approach 
is that it neither assumes knowledge of the uncertainty set, as it is common in robust optimization, nor of 
its probability distribution, as it is usually required in stochastic optimization. Moreover, the scenario ap- 
proach is computationally efficient as its solution is based on a deterministic optimization program that is 
canonically convex, even when the original chance-constrained problem is not. Recently, researchers have 
obtained theoretical foundations for the scenario approach, providing a direct link between the number of 
scenarios and bounds on the constraint violation probability. 

These bounds are tight in the general case of an uncertain optimization problem with a single chance 
constraint. However, this paper shows that these bounds can be improved in situations where the con- 
straints have a limited 'support rank', a new concept that is introduced for the first time. This property is 
typically found in a large number of practical applications — most importantly, if the problem originally 
contains multiple chance constraints (e.g. multi-stage uncertain decision problems), or if a chance con- 
straint belongs to a special class of constraints (e.g. linear or quadratic constraints). In these cases the 
quality of the scenario solution is improved while the same bound on the constraint violation probability 
is maintained, and also the computational complexity is reduced. 

Key words: Uncertain Optimization, Chance Constraints, Randomized Methods, Convex Optimization, 
Scenario Approach, Multi-Stage Decision Problems. 

1 Introduction 

Optimization is ubiquitous in modern problems of engineering and science, where a decision or design
variable $x \in \mathbb{R}^d$ has to be selected from a constrained set $\mathbb{X} \subset \mathbb{R}^d$ and its quality is measured against
some objective or cost function $f_0 : \mathbb{R}^d \to \mathbb{R}$. A common difficulty in these problems is the lack of
precise information about some of the underlying data, so that for a particular solution it remains uncertain
whether the constraints are satisfied and which objective or cost value is achieved. This uncertainty shall
be denoted by an (unknown) abstract variable $\delta \in \Delta$, where $\Delta$ is understood as the uncertainty set of
a non-specified nature. It may affect the target function $f_0$ and/or the feasible set $\mathbb{X}$, where the latter
situation represents a particular challenge since good solutions are usually found close to the boundary of
this set, which requires a trade-off between the objective value and the strictness of constraint satisfaction.
A large variety of approaches addressing this issue have been proposed in the areas of robust and stochastic
optimization [3, 4, 6, 16, 17, 19, 21, 23], with the preferred method depending on the particular
problem at hand.

*This manuscript is the preprint of a paper submitted to the SIAM Journal on Optimization and it is subject to SIAM copyright.
SIAM maintains the sole rights of distribution or publication of the work in all forms and media. If accepted, the copy of record will
be available at http://www.siam.org

†Automatic Control Laboratory, Swiss Federal Institute of Technology, Zurich, Switzerland
(schildbach|morari@control.ee.ethz.ch).

‡Dipartimento di Automatica e Informatica, Politecnico di Torino, Torino, Italy and Department of Mechanical Engineering,
University of California at Santa Barbara, Santa Barbara (CA), United States (lorenzo.fagiano@polito.it).

For many practical applications, the formulation of chance constraints has proven to be the appropriate
concept to handle uncertainty. Here the uncertainty $\delta$ is assumed to have a stochastic nature, and the decision
variable $x$ is allowed to fall outside of the feasible set $\mathbb{X}$, but only with a probability no higher than a given
threshold $\varepsilon \in (0,1)$. Yet chance-constrained optimization programs generally remain difficult to solve. In
special cases, they can be tackled by stochastic optimization techniques, when a probability distribution
function for $\delta$ is also assumed [6, 16, 21, 23]. However, their deterministic reformulation is usually non-convex,
and it can be difficult even to find a feasible point with a good objective function value.

Recent contributions have shown that randomized algorithms are a viable approach for finding, with
high confidence, a sub-optimal solution to chance-constrained programs; see [9]-[13] and the references
therein. They propose a scenario-based optimization approach ('scenario approach') for finding values of
$x$ that are, with a very high confidence, feasible with respect to the chance constraints and achieve good
objective function values. Their theory considers the following general class of problems:

$$\min_{x \in \mathbb{X}} \; c^{T} x , \tag{1.1a}$$
$$\text{s.t.} \quad \Pr\{\delta \in \Delta \mid x \in \mathcal{X}(\delta)\} \ge 1 - \varepsilon , \tag{1.1b}$$

where $c \in \mathbb{R}^d$ defines the linear objective function, $\mathbb{X} \subset \mathbb{R}^d$ represents a deterministic convex constraint
set, and $\mathcal{X}(\delta) \subset \mathbb{R}^d$ is a convex set that depends on the uncertainty $\delta \in \Delta$, in which the decision variable
has to lie with a probability of at least $1 - \varepsilon$. While (1.1) appears to be restrictive at first sight, it actually
encompasses a vast range of problems, namely any uncertain optimization program which is convex
whenever the value of the uncertainty $\delta$ is fixed.

The fundamental idea of the scenario approach is simple and intuitive: it draws a finite number $K \in \mathbb{N}$
of independent and identically distributed (i.i.d.) samples of the uncertainty $\delta$, and proposes to use the
solution of the optimization program ('scenario program') obtained by replacing the chance constraint (1.1b)
with the resulting $K$ sampled constraints, which are deterministic and convex. Strong results have been
developed in [10, 12], providing tight bounds for the choice of $K$ that link it directly to the probability
with which the solution to the scenario program ('scenario solution') violates the chance constraint (1.1b).
Furthermore, as shown in [10, 13], these bounds extend to the case where, after sampling $K$ constraints, a
given number $R \in \mathbb{N}$ of sampled constraints are removed, a procedure that can be applied to improve the
objective value of the scenario solution.
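
To make the link between the number of scenarios $K$ and the violation probability concrete, the following minimal sketch computes the smallest $K$ for a single chance constraint. It assumes the binomial-tail condition $\sum_{j=0}^{d-1} \binom{K}{j} \varepsilon^j (1-\varepsilon)^{K-j} \le \beta$ that is commonly quoted in the scenario-approach literature for the case without constraint removal. This formula, the confidence parameter `beta`, and the function names are not taken from the text above and serve only as an illustration.

```python
from math import comb


def binomial_tail(K: int, d: int, eps: float) -> float:
    """Return sum_{j=0}^{d-1} C(K, j) * eps^j * (1 - eps)^(K - j)."""
    return sum(comb(K, j) * eps**j * (1.0 - eps) ** (K - j) for j in range(d))


def min_scenarios(d: int, eps: float, beta: float) -> int:
    """Smallest K >= d whose binomial tail does not exceed beta (assumed bound)."""
    K = d
    while binomial_tail(K, d, eps) > beta:
        K += 1
    return K


if __name__ == "__main__":
    # Hypothetical values: d decision variables, level eps, confidence 1 - beta.
    for d in (1, 5, 10):
        print(d, min_scenarios(d, eps=0.1, beta=1e-3))
```

Under this assumed condition the required number of scenarios grows with the dimension $d$, which is the quantity the 'support rank' introduced later in the paper is meant to replace by a smaller one.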

A key advantage of the scenario approach is that it requires knowledge neither of the distribution function
of the uncertainty nor of the uncertainty set $\Delta$, which may be of a very general nature; it only assumes
the availability of a sufficient number of i.i.d. random samples of $\delta$. Therefore it could be argued that
the scenario approach is at the heart of any robust and stochastic optimization method, because either the
uncertainty set $\Delta$ or the probability distribution of $\delta$ is constructed based on some (necessarily finite)
experience of the uncertainty. Another computational advantage is that the solution is obtained through a
deterministic convex optimization program, for which numerous efficient algorithms and solvers are readily
available HjlEO) . 

While the existing theory on the scenario approach, and also in stochastic programming, can only handle
single chance constraints, as in (1.1), problems with multiple chance constraints are also of interest:

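$$\min_{x \in \mathbb{X}} \; c^{T} x , \tag{1.2a}$$

$$\text{s.t.} \quad \Pr\{\delta \in \Delta \mid x \in \mathcal{X}_i(\delta)\} \ge 1 - \varepsilon_i \quad \forall\, i \in \mathbb{N}_1^N , \tag{1.2b}$$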

where $i$ is the chance constraint index running in $\mathbb{N}_1^N := \{1, 2, \dots, N\}$. One motivation for this is that
systems in science and engineering, which are steadily increasing in size and complexity, often comprise
multiple sub-systems that are individually subject to uncertain constraints, yet all sub-systems have to be
considered jointly for an optimal operation of the entire system. Another motivation is multi-stage
stochastic decision problems [6, Cha. 7], [16, Cha. 8], [21, Cha. 13], [23, Cha. 3], in which typically one
chance constraint needs to be enforced per time step, while the optimal decision problem comprises multiple
time steps. In view of the latter, the $N$ chance constraints of (1.2b) will also be referred to as 'stages' in
this paper; yet the multiple chance constraints are treated in a general, i.e. not necessarily temporal, manner.

Note that, in principle, a sub-optimal solution to (1.2) can be obtained by finding a sub-optimal
solution to (1.1), if a single chance constraint is set up as follows:

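$$\mathcal{X} := \mathcal{X}_1 \cap \mathcal{X}_2 \cap \dots \cap \mathcal{X}_N \quad \text{and} \quad \varepsilon := \min\{\varepsilon_1, \varepsilon_2, \dots, \varepsilon_N\} . \tag{1.3}$$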

However, this procedure may introduce a considerable amount of conservatism, as it requires the scenario
solution $x$ to lie simultaneously inside all constraint sets $\mathcal{X}_i$ with the highest of all probabilities $1 - \varepsilon_i$.
Clearly, this conservatism increases for a larger number of chance constraints $N$ and a greater variation
in the values of $\varepsilon_i$.

The fundamental contribution of this paper is to reduce the existing sample bounds if the scenario
approach is applied to stochastic programs with multiple chance constraints, with the key effects of
improving the quality of the scenario solution and decreasing the computational complexity, while
the probabilistic guarantees remain exactly the same as before. The developed theory also applies to
many single-stage problems, where the existing sample bounds (which are tight in the general case) are
reduced in cases when the chance constraint carries a special property, which is held by many common
constraint classes. Intuitively speaking, the property is that the chance constraint fails to constrain some
'directions' of the search space, just as a linear constraint with an uncertain right-hand side leaves a
$(d-1)$-dimensional subspace of $\mathbb{R}^d$ unconstrained.

The paper is organized as follows. Section 2 presents a technical statement of the problem, and Section 3
provides further background on its properties. Section 4 introduces the new concept of the 'support rank' of
a chance constraint, which directly leads to the derivation of the main results of this paper: sample bounds
for the multi-stage stochastic program are derived in Section 5 and a sampling-and-discarding procedure
is described in Section 6. Finally, Section 7 presents an example that demonstrates the potential benefits of
this theory in certain applications, when compared to the classic scenario approach.

2 Problem Formulation 

2.1 Stochastic Program with Multiple Chance Constraints 

Consider a stochastic optimization problem with linear objective function and multiple chance constraints, 
which shall be referred to as the Multi-Stage Stochastic Program MSP: 

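$$\min_{x \in \mathbb{X}} \; c^{T} x , \tag{2.1a}$$

$$\text{s.t.} \quad \Pr[f_i(x,\delta) \le 0] \ge 1 - \varepsilon_i \quad \forall\, i \in \mathbb{N}_1^N . \tag{2.1b}$$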

Here $c \in \mathbb{R}^d$ defines a linear target function, and $\mathbb{X} \subset \mathbb{R}^d$ represents a compact and convex set, which can
be regarded as the combination of any set of deterministic convex constraints. It also fulfills the technical
purpose of ensuring the boundedness of any solution to (2.1). $\Pr$ denotes a probability measure on the
uncertainty set $\Delta$, with respect to the random variable $\delta \in \Delta$. The chance constraints $i \in \mathbb{N}_1^N$ in (2.1b) will
be referred to as stages of the MSP, in order to avoid ambiguities in the nomenclature of this paper. They
involve constraint functions $f_i : \mathbb{R}^d \times \Delta \to \mathbb{R}$ that depend on the optimization variable $x \in \mathbb{R}^d$ as well as
on the random uncertainty $\delta \in \Delta$ in the following manner.

Assumption 2.1 (Convexity) The constraint functions $f_i(\cdot,\cdot)$ of every stage $i \in \mathbb{N}_1^N$ satisfy the property
that for almost every uncertainty $\delta \in \Delta$, $f_i(\cdot,\delta)$ is a convex function in the optimization variable.

Except for Assumption 2.1, the constraint functions $f_i(\cdot,\cdot)$ are entirely arbitrary; in particular, their
dependence on the random variable $\delta$ is completely generic.

The inequality '$f_i(x,\delta) \le 0$' in a chance constraint of (2.1b) is referred to as the nominal chance
constraint, while $\varepsilon_i \in (0,1)$ represents the chance constraint level. For any given $x \in \mathbb{R}^d$, the nominal
chance constraint characterizes the subset of $\Delta$ whose complement needs to be controlled (in its probability
measure) by $\varepsilon_i$, in order for $x$ to be considered as a feasible point of the particular stage $i$. This becomes
more obvious by noting that the probability notation in (2.1b) is short for

$$\Pr[f_i(x,\delta) \le 0] := \Pr\{\delta \in \Delta \mid f_i(x,\delta) \le 0\} , \tag{2.2}$$

avoiding the cumbersome way of writing out the probability as the measure of a subset of A. Both notations 
are used in the sequel, where the short notation (indicated by square brackets) is preferred for the sake of 
simplicity and the long notation (indicated by curly brackets) is used when it brings additional clarity to the 
expression. 

The use of 'min' instead of 'inf' in (2.1) is justified by the fact that the feasible set of a single stage
can be shown to be closed for very general cases [16, Thm. 2.1]. The feasible set is then compact, by the
presence of $\mathbb{X}$, and any infimum is indeed attained; otherwise 'inf' can always be replaced by 'min' for
finding an optimal point in the closure of the feasible set.

The results presented in this paper are distribution-free in the sense that they do not rely on knowledge
of the measure $\Pr$, or the corresponding probability distribution of $\delta$. It is required, however, that such
a probability measure on $\Delta$ exists, so that $\Pr$ can be used throughout this paper for analysis purposes. In
fact, not even the support set $\Delta$, which can be very general in nature, needs to be known.

Instead, it is assumed that a sufficient number of independent random samples can be obtained from $\delta$.
Moreover, it remains a standing assumption that the $\sigma$-algebra of $\Pr$-measurable sets in $\Delta$ is large enough
to contain all sets whose probability measure is used in this paper, like the one in (2.2). These requirements
appear to be reasonable for a large range of practical applications; see also [12, p. 4].

Remark 2.2 (Generality of Problem Formulation) The formulation (2.1) comprises, in particular, the
following problem settings. (Some of them actually appear in the example of Section 7.) (a) A random,
non-linear, convex objective function can be included by an epigraph reformulation, with the new objective 
being a scalar and hence linear fiE} Sec. 3.1.7]. (b) The constraint function of a vector-valued nominal 
chance constraint, also known as a joint chance constraint, can be transformed into a scalar, non-linear, 
convex constraint function by taking the point-wise maximum of its component functions, preserving its 
convexity fiE\ Sec. 3.2.3]. (c) If different stages depend on different uncertain variables, all uncertainties 
can be combined into the vector uncertainty $\delta$, as not all stages have to depend explicitly on all entries of
$\delta$. (d) Any non-random and convex constraints can be readily included as part of the convex and compact
set $\mathbb{X}$.

The following assumption is made in order to avoid unnecessary technical issues, which are of little
relevance in most practical applications; compare [12, Ass. 1].

Assumption 2.3 (Existence and Uniqueness) (a) Problem (2.1) admits at least one feasible point. By the
compactness of $\mathbb{X}$, this implies that there exists at least one optimal point of (2.1). (b) If there are multiple
optimal points of (2.1), a unique one is selected with the help of an appropriate tie-break rule (e.g. the
lexicographic order on $\mathbb{R}^d$).

2.2 The Randomization Approach 

From the basic literature on stochastic optimization, e.g. [6, 16, 21, 23], it is known that the MSP is generally
a hard problem to solve. Even in the case when the joint distribution of $\delta$ is known, solving (2.1) is equivalent
to solving a non-convex deterministic optimization program, except for a few very special cases.
Furthermore, the probability calculations involve the computation of convolution integrals, which generally
have to be approximated by numerical means.

The randomized optimization approach proposed in this paper can be used to obtain an approximate
solution to the MSP, which is a feasible point of every stage $i = 1, \dots, N$ with a selected confidence
probability $(1 - \theta_i)$. It is closely related to the theory on the scenario approach, as described e.g. in [10]-[13].

Suppose that for each stage $i \in \mathbb{N}_1^N$ a certain number $K_i \in \mathbb{N}$ of random samples (or 'uncertainty
scenarios' [11]) is drawn for the uncertainty $\delta$. They are denoted by $\delta^{(i,\kappa_i)}$ for $\kappa_i \in \mathbb{N}_1^{K_i}$ and,
for brevity of notation, also as collective multi-samples $\omega^{(i)} := \{\delta^{(i,1)}, \dots, \delta^{(i,K_i)}\}$. The collection of all
samples shall be combined into the multi-sample $\omega := \{\omega^{(1)}, \dots, \omega^{(N)}\}$, with their total number given by
$K := \sum_{i=1}^{N} K_i$. All of these samples should be considered as 'identical copies' of the random uncertainty
$\delta$, in the sense that they are themselves random variables and satisfy the following key assumption.

Assumption 2.4 (Independence and Identical Distribution) The sampling procedure is designed such
that the set of all random samples together with the actual uncertainty,

$$\bigcup_{i \in \mathbb{N}_1^N} \{\delta^{(i,1)}, \dots, \delta^{(i,K_i)}\} \cup \{\delta\} ,$$

constitutes a set of independent and identically distributed (i.i.d.) random variables.

The Multi-Stage Randomized Program MRP[$\omega^{(1)}, \dots, \omega^{(N)}$] is constructed as follows:

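$$\min_{x \in \mathbb{X}} \; c^{T} x , \tag{2.3a}$$

$$\text{s.t.} \quad f_i(x, \delta^{(i,\kappa_i)}) \le 0 \quad \forall\, \kappa_i \in \mathbb{N}_1^{K_i} ,\; i \in \mathbb{N}_1^N . \tag{2.3b}$$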

Observe that in problem (2.3) the target function of (2.1) is minimized while enforcing each nominal
chance constraint for all values in its corresponding random multi-sample. Thus problem (2.3) is stochastic
in the sense that its solution depends on the random samples in $\omega$. It is important only for analytic purposes,
while it is actually solved for the observations of the random samples, leading to the deterministic version
of the Multi-Stage Randomized Program MRP[$\bar{\omega}^{(1)}, \dots, \bar{\omega}^{(N)}$]:

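$$\min_{x \in \mathbb{X}} \; c^{T} x , \tag{2.4a}$$

$$\text{s.t.} \quad f_i(x, \bar{\delta}^{(i,\kappa_i)}) \le 0 \quad \forall\, \kappa_i \in \mathbb{N}_1^{K_i} ,\; i \in \mathbb{N}_1^N . \tag{2.4b}$$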

Problem (2.4) arises from (2.3) through replacement of the (random) samples $\delta^{(i,\kappa_i)}$, $\omega^{(i)}$, and $\omega$ by their
(deterministic) outcomes $\bar{\delta}^{(i,\kappa_i)}$, $\bar{\omega}^{(i)}$, and $\bar{\omega}$. Throughout the paper, the outcomes are distinguished from
their random variables by a bar. Due to Assumption 2.1, problem (2.4) is a convex program of a pre-known
structure. As such it can be solved efficiently by choice of a suitable algorithm from convex optimization,
see e.g. EHSI23. 

Remark 2.5 (Relation to the Scenario Approach) (a) Problem (2.3) is a generalization of a 'random
convex program' [10] or a 'scenario program' [12]: instead of only a single stage ($N = 1$), it may contain
multiple stages ($N > 1$). This can be relevant to applications involving several independent chance constraints,
resulting e.g. from different conditions on local sub-systems or time steps in a multi-stage decision
problem. (b) Problem (2.4) provides the cost-optimal solution that is feasible with respect to all nominal
constraints $i = 1, \dots, N$ in (2.1) under each of the scenarios in $\bar{\omega}^{(1)}, \dots, \bar{\omega}^{(N)}$. In this sense the proposed
randomization method can be related to the classic scenario approach; however, the scenario numbers $K_i$
used in the problem may generally differ between the stages. (c) An approximate solution to (2.1) can also
be found by a reformulation using only a single stage, and hence applying the existing theory for randomized
programs (RPs). Here the nominal chance constraint is defined as the point-wise maximum of the
functions $f_1(\cdot,\delta), \dots, f_N(\cdot,\delta)$ and the chance constraint level as the minimum of $\varepsilon_1, \dots, \varepsilon_N$. However, in
general this approach comes at the price of an increased conservatism and therefore a higher cost of the
solution Ml\ Sec. 2.1.2], as demonstrated also in the example of Section 7.

The fundamental goal of the randomization theory is to show that the solution to (2.3) has a limited
violation probability, that is the probability of violating the nominal chance constraints (2.1b). This basic
notion is recalled in the next section.

2.3 Randomized Solution and Violation Probability

In order to avoid unnecessary technical complications, the following assumption is introduced to ensure
that there always exists a feasible solution to problem (2.4), similar to [12, p. 3].

Assumption 2.6 (Feasibility) (a) For any number of samples $K_1, \dots, K_N$, the randomized problem (2.3)
admits a feasible solution almost surely. (b) For the sake of notational simplicity, any $\Pr$-null set for which
(a) may not hold is assumed to be removed from $\Delta$.

Feasibility can be taken for granted in the majority of practical problems; if it does not hold for a
particular application, a generalization of the presented theory to include the infeasible case follows
as shown in [10].

Given the existence of a solution to (2.4), uniqueness follows by Assumption 2.1 and by carrying over
the tie-break rule of Assumption 2.3(b); see [22, Thm. 10.1, 7.1]. Hence the solution map

$$\bar{x}^{*} : \Delta^{K} \to \mathbb{X} , \tag{2.5}$$

which returns as $\bar{x}^{*}(\bar{\omega}^{(1)}, \dots, \bar{\omega}^{(N)})$ the unique optimal point of MRP[$\bar{\omega}^{(1)}, \dots, \bar{\omega}^{(N)}$] for a given observation
of the multi-samples $\{\bar{\omega}^{(1)}, \dots, \bar{\omega}^{(N)}\} \in \Delta^{K}$, is well-defined. The solution map can also be applied to
problem MRP[$\omega^{(1)}, \dots, \omega^{(N)}$], for which it is denoted by $x^{*} : \Delta^{K} \to \mathbb{X}$. Then $x^{*}(\omega^{(1)}, \dots, \omega^{(N)}) \in \mathbb{R}^d$
represents a random vector whose probability distribution is unknown; in fact it is a complicated function
of the geometry of the problem and its parameters. This random vector will also be referred to as the
randomized solution.

Based on a given choice for the sample sizes $K_1, \dots, K_N$, the goal is to obtain a bound on the probability
with which the randomized solution $x^{*}(\omega^{(1)}, \dots, \omega^{(N)})$ violates the nominal chance constraints of the MSP
for the actual uncertainty $\delta$. Clearly, this involves two levels of randomness present in the problem:
the first introduced by the random samples in $\omega$, which determine the randomized solution; and the second
introduced by the random uncertainty $\delta$, which determines whether or not the randomized solution violates
the nominal chance constraints in (2.1b). For this reason, the randomization approach of this paper is also
called a double-level of probability approach [9, Rem. 2.3].

To highlight the two probability levels more clearly, suppose first that the multi-sample $\omega$ has already
been observed, so that the randomized solution $\bar{x}^{*}(\bar{\omega}^{(1)}, \dots, \bar{\omega}^{(N)})$ is fixed. Then for every stage $i = 1, \dots, N$
in (2.1b), the ex-post violation probability $\bar{V}_i(\bar{\omega}^{(1)}, \dots, \bar{\omega}^{(N)})$ is given by

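$$\bar{V}_i(\bar{\omega}^{(1)}, \dots, \bar{\omega}^{(N)}) := \Pr\{\delta \in \Delta \mid f_i(\bar{x}^{*}(\bar{\omega}^{(1)}, \dots, \bar{\omega}^{(N)}), \delta) > 0\} . \tag{2.6}$$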

In particular, each $\bar{V}_i$ is a deterministic value in $[0,1]$. Now, if the multi-sample $\omega$ has not yet been observed,
the randomized solution $x^{*}(\omega^{(1)}, \dots, \omega^{(N)})$ is a random vector and so the ex-ante violation probability

$$V_i(\omega^{(1)}, \dots, \omega^{(N)}) := \Pr\{\delta \in \Delta \mid f_i(x^{*}(\omega^{(1)}, \dots, \omega^{(N)}), \delta) > 0\} \tag{2.7}$$

becomes itself a random variable with support in $[0,1]$, for every stage $i = 1, \dots, N$ in (2.1b). Hence the goal
is to ensure that $V_i(\omega^{(1)}, \dots, \omega^{(N)}) \le \varepsilon_i$ for all $i = 1, \dots, N$, with a sufficiently high confidence probability
$(1 - \theta_i)$. Before these results can be derived, however, some structural properties of the randomized program
and technical lemmas need to be discussed.
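
The two levels of probability can be illustrated on a toy problem. The sketch below assumes $d = 1$, $c = 1$, $\mathbb{X} = [0, 2]$, two stages with $f_1(x,\delta) = \delta - x$ and $f_2(x,\delta) = \delta/2 - x$, and $\delta$ uniform on $[0,1]$; all of these choices, and the function names, are hypothetical and not taken from the paper. Under these assumptions the deterministic program (2.4) has a closed-form solution and the ex-post violation probabilities (2.6) can be evaluated exactly.

```python
import random


def solve_mrp(omega1, omega2):
    """Closed-form solution of the toy MRP: min x s.t. x >= d1, x >= d2/2, x in [0, 2]."""
    return max(0.0, max(omega1), 0.5 * max(omega2))


def ex_post_violation(x_star):
    """Exact violation probabilities of both stages for delta ~ Uniform[0, 1]."""
    v1 = min(1.0, max(0.0, 1.0 - x_star))        # Pr{delta - x* > 0}
    v2 = min(1.0, max(0.0, 1.0 - 2.0 * x_star))  # Pr{delta/2 - x* > 0}
    return v1, v2


def draw_multisample(rng, K1, K2):
    """Draw i.i.d. samples for both stages (in the spirit of Assumption 2.4)."""
    return [rng.random() for _ in range(K1)], [rng.random() for _ in range(K2)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Each repetition yields a different observed multi-sample, hence a
    # different (deterministic) pair of ex-post violation probabilities.
    for _ in range(3):
        w1, w2 = draw_multisample(rng, K1=20, K2=5)
        x_star = solve_mrp(w1, w2)
        print(x_star, ex_post_violation(x_star))
```

Once the multi-sample is fixed, the returned pair is a deterministic number for each stage; repeating the draw shows how the same quantity behaves as a random variable before the samples are observed.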

3 Structural Properties of the MRP 

This section discusses some fundamental facts on RPs, partly based on existing theory [10, 12], but also
introduces some new concepts and results. This lays the technical groundwork that is required for the proof 
of the main results of this paper. Many of the concepts presented here are in their essence even more general, 
see Q3 1, but this lies beyond the scope of this paper. 
3.1 Support and Essential Constraint Sets

Similar to the solution map and violation probability, the new concepts of this section are defined first for
the deterministic problem MRP (2.4), and then carried over in a probabilistic sense to the stochastic
problem MRP (2.3).

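Definition 3.1 (Support Constraint Set) For some $i \in \mathbb{N}_1^N$ and $\kappa_i \in \mathbb{N}_1^{K_i}$, the constraint $f_i(x, \bar{\delta}^{(i,\kappa_i)}) \le 0$ is
a support constraint of (2.4) if its removal from the problem entails a change in the optimal solution:

$$\bar{x}^{*}(\bar{\omega}^{(1)}, \dots, \bar{\omega}^{(N)}) \ne \bar{x}^{*}(\bar{\omega}^{(1)}, \dots, \bar{\omega}^{(i)} \setminus \{\bar{\delta}^{(i,\kappa_i)}\}, \dots, \bar{\omega}^{(N)}) .$$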



Definition 3.1 could be stated equivalently in terms of the objective value [10, p. 3430]: a constraint is
a support constraint if and only if the optimal target function value (or its preference by the tie-break rule)
is strictly larger than when the constraint is removed.

In the stochastic setting of the MRP, a particular random sample $\delta^{(i,\kappa_i)}$ generating a support constraint
becomes of course a random event, which can be associated with a certain probability. Similarly, the support
constraint sets $\mathrm{Sc}_1, \dots, \mathrm{Sc}_N$ and $\mathrm{Sc}$ are naturally random sets.

While the removal of a single non-support constraint from MRP does not affect its solution, this does 
not mean that the non-support constraints can be omitted from the problem all at once without a change of 
the optimal point. For example, the configuration in Figure 3.1(c) includes only one support constraint, but
the solution would change if the other constraints were removed all at once. This basic fact is the motivation 
for the definition of an essential set. 
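
The removal test of Definition 3.1, and the caveat about discarding all non-support constraints at once, can be checked on a one-dimensional toy program. The sketch assumes constraints of the hypothetical form $x \ge b_k$ with objective $\min x$ and $x \ge 0$; the function names are illustrative only. The second configuration is a simplified variant of the degenerate situation: it has two duplicated active constraints and therefore no support constraint at all.

```python
def solve(bounds, lower=0.0):
    """Solve min x s.t. x >= b for all b in bounds and x >= lower."""
    return max([lower, *bounds])


def support_constraints(bounds):
    """Indices whose individual removal changes the optimal point."""
    x_full = solve(bounds)
    return [k for k in range(len(bounds))
            if solve(bounds[:k] + bounds[k + 1:]) != x_full]


if __name__ == "__main__":
    generic = [0.3, 0.7, 1.0]
    print(support_constraints(generic))      # only the active constraint

    duplicated = [0.3, 1.0, 1.0]
    supp = support_constraints(duplicated)   # removing either copy alone changes nothing
    kept = [duplicated[k] for k in supp]
    # Dropping every non-support constraint at once changes the solution.
    print(supp, solve(duplicated), solve(kept))
```

In the duplicated configuration each copy of the active constraint on its own reproduces the full solution, so there are two distinct candidate sets of constraints that would have to be kept.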



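Definition 3.2 (Essential Constraint Set) An essential (constraint) set of MRP[$\bar{\omega}^{(1)}, \dots, \bar{\omega}^{(N)}$] is a subset of
the constraints in (2.4b), generated by some $\{\tilde{\omega}^{(1)}, \dots, \tilde{\omega}^{(N)}\} \subseteq \{\bar{\omega}^{(1)}, \dots, \bar{\omega}^{(N)}\}$, for which the following two
conditions hold: (a) the solution of the reduced problem remains the same as that of the full problem, i.e.

$$\bar{x}^{*}(\tilde{\omega}^{(1)}, \dots, \tilde{\omega}^{(N)}) = \bar{x}^{*}(\bar{\omega}^{(1)}, \dots, \bar{\omega}^{(N)}) ;$$

(b) all of the constraints in the essential set are support constraints of the reduced problem MRP[$\tilde{\omega}^{(1)}, \dots, \tilde{\omega}^{(N)}$].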

To be precise, Definition 3.2 should also allow for the set $\mathbb{X}$ to be an element of the essential set.
However, this can be considered a minor subtlety which is assumed to be understood in the sequel.

Unlike the support constraint set Sc, the essential constraint set of the deterministic problem MRP is not 
necessarily unique; in this case the problem is said to be degenerate. If the problem is non-degenerate, then 
its (unique) essential set is exactly equal to Sc, see QUI Def. 2.7]. For example, the configuration depicted 
in Figure 3.1(c) is degenerate as it contains two different essential sets. The stochastic problem MRP is
called non-degenerate if it is non-degenerate with probability one, and degenerate otherwise. 

Suppose that the MRP is non-degenerate and consider the cardinality $\mathrm{card}(\mathrm{Sc})$ of the support set $\mathrm{Sc}$,
which is a random integer in $\mathbb{N}_0^d$ by [11, Thm. 2]. Then problem (2.3) is said to be regular if $\mathrm{card}(\mathrm{Sc})$ equals
some constant value $z \in \mathbb{N}_0^d$ with probability one. So if problem (2.3) is non-degenerate and non-regular,
this means that it has a unique essential set equal to the support set with probability one, even though its
cardinality takes on at least two different values with non-zero probability. An example of this situation is
found in Figure 3.1, supposing that both situations (a) and (b) can occur with non-zero probability.

If degeneracy and non-regularity are simultaneously present in MRP, then the above concept of regularity
is adjusted by considering the minimal cardinality of all essential sets. In other words, problem (2.3)
is degenerate and non-regular if the minimal cardinality of all essential sets takes on at least two different
values with non-zero probability. It is straightforward to imagine such a case with the help of the following
Example 3.3.

(a) Non-degenerate, card(Es) = 2. (b) Non-degenerate, card(Es) = 1. (c) Degenerate, card(Es) = 2.

Figure 3.1: Illustration of Example 3.3 in $\mathbb{R}^2$. The arrow represents the direction of lower target function
values $-c$; bold lines indicate the support constraints of the respective configuration.

Example 3.3 To illustrate the above concepts, consider a multi-stage RP (2.3) in dimension $d = 2$,
for which Figure 3.1 shows three configurations of sampled constraints.

The problem is non-degenerate and regular if its constraints are in a generic configuration, as sketched 
in either Figure 3.1(a) or (b), with probability one. This means that with probability one there are exactly
two support constraints (or there is exactly one support constraint), which also form(s) the unique essential 
set. 

Degeneracy means that a situation as in Figure 3.1(c) can occur with non-zero probability, where the
essential set fails to be uniquely determined. Here there are two essential sets, both containing the support 
constraint and both having the cardinality of two. 

Non-regularity, under non-degeneracy, means that the essential set is unique with probability one, but it 
can be of varying cardinality. For instance, if the two configurations of Figure 3.1(a,b) can both occur with
positive probability, then the problem is non-regular and non-degenerate. 

Finally, the problem is degenerate and non-regular if both phenomena can occur with non-zero probability.
This concludes Example 3.3 and naturally leads to a closer examination of the consequences of
degeneracy and non-regularity.

3.2 Degeneracy and Non-Regularity 

Both degeneracy and non-regularity play an important role in the derivation of the main results of this paper, 
and therefore need to be examined more closely. Note that it seems unrealistic to rule out either possibility 
by assumption, because their presence in very simple examples suggests that they do occur in a variety of 
practical applications. 

Previous results for the single-stage RP case (where $N = 1$) start with the simplifying assumption
that problem (2.3) is non-degenerate and regular with probability one. In another step, this assumption
is then relaxed in order to include the exceptional cases, which adds a significant level of complexity to
the arguments. Leveraging the insightful arguments in [10, 12] (for $N = 1$), degeneracy and non-regularity
are dealt with here in an integrated way for the sake of brevity. Some preliminary technical results
are required for this and are stated below.

Degeneracy 

The authors of [12] propose a 'heating-cooling' procedure transforming the degenerate constraint system
into a non-degenerate system (and back), where both systems are shown to have the same constraint violation
characteristics. A similar argument is used in [10], relying on existing infinitesimal perturbation
techniques in order to recover an (arbitrarily close) non-degenerate constraint system from the degenerate
one.

Similarly to the cited procedures, the approach chosen here derives a non-degenerate problem from
the degenerate one, but in a slightly different manner. Namely, a unique essential set is selected by the
introduction of another tie-break rule, used for determining whether or not a constraint is considered to be
essential. Concretely, this means that in addition to its generating random sample $\delta^{(i,\kappa_i)}$, every constraint
in (2.3b) has an associated tie-break random variable which is, for instance, uniformly distributed in $[0,1]$.
When two (or more) different essential constraint sets are compared, a unique one can be selected with
probability one, based on the lowest sum of tie-break values.

Definition 3.4 (Minimal Essential Set) The minimal essential (constraint) set Es of MRP is the (with
probability one) unique essential set that (a) has the minimal possible cardinality and (b) has the lowest
sum of tie-break values among all minimal essential sets.

To be precise, in Definition 3.4 a tie-break random variable is assigned to the set X as well, which
may also be included in the minimal essential set besides the sampled constraints of (2.4b). Hence the
uniqueness of the minimal essential set Es, by virtue of the tie-breaker, makes it possible to refer to its
elements as the essential constraints of problem (2.4).

The following lemma is an adaptation of the Sampling Lemma in [15, Lem. 1.1] for the purposes of 
multi-stage RPs. 

Lemma 3.5 (Sampling Lemma) Suppose another sampled constraint of stage $i \in \mathbb{N}_1^N$, for the sample $\delta$,
is added to the problem MRP$[\omega^{(1)}, \ldots, \omega^{(N)}]$:

$f_i(x, \delta) \le 0$. (3.1)

If the solution $x^\star(\omega^{(1)}, \ldots, \omega^{(N)})$ violates this additional sampled constraint, then the latter must be an
essential constraint of the extended problem MRP$[\omega^{(1)}, \ldots, \omega^{(i-1)}, \omega^{(i)} \cup \{\delta\}, \omega^{(i+1)}, \ldots, \omega^{(N)}]$.

Proof. It is an immediate consequence of Definition 3.1 that the solution $x^\star(\omega^{(1)}, \ldots, \omega^{(N)})$ violates the
additional sampled constraint (3.1) if and only if the latter is a support constraint of the augmented problem
MRP$[\omega^{(1)}, \ldots, \omega^{(i-1)}, \omega^{(i)} \cup \{\delta\}, \omega^{(i+1)}, \ldots, \omega^{(N)}]$.

But if the new constraint (3.1) is a support constraint of the augmented problem, then it is part of all
essential constraint sets by [10, Lem. 2.10], and therefore it must be contained in Es. □

Non-Regularity 

The approach of [12] tackles non-regularity by the introduction of a 'ball solution', with the goal of
recovering the property of full support. In [10], non-regularity is covered only in a deterministic sense, namely
as far as it can be incorporated in the definition of 'Helly's dimension'. In this paper, the issue of
non-regularity is resolved by the introduction of a conditional probability measure, where the conditional event
is a fixed cardinality of the essential set.

For this purpose, however, the notion of regularity (as defined in Section 3.1) first has to be refined to
account for the presence of multiple stages. Observe that the minimal essential set Es of (2.4) can be broken
down into

$\mathrm{Es} = \mathrm{Es}_X \cup \mathrm{Es}_1 \cup \mathrm{Es}_2 \cup \ldots \cup \mathrm{Es}_N$, with $\mathrm{Es}_X \in \{\emptyset, X\}$,

where $\mathrm{Es}_1, \ldots, \mathrm{Es}_N$ denote the essential constraints contributed by the corresponding stages $i = 1, \ldots, N$.

In the stochastic setting of the MRP, the sets $\mathrm{Es}$ and $\mathrm{Es}_1, \ldots, \mathrm{Es}_N$ are random, and hence their
cardinalities $z := \mathrm{card}(\mathrm{Es})$ and $z_i := \mathrm{card}(\mathrm{Es}_i)$ are random integers. In the sense of the previous definition of
regularity, the multi-stage RP is regular if $z$ equals some constant value almost surely; it shall be called
multi-regular if for all $i \in \mathbb{N}_1^N$ the individual cardinalities $z_i$ equal some constant values almost surely.

The following lemma will be important for tackling problems that fail to satisfy the property of
multi-regularity. Suppose in problem (2.3) the number of samples is varied for each stage, while the sample sizes
of the other stages remain constant. For example, in the case of the first stage i = 1 the varying sample size
is denoted by $\tilde K_1$ (to distinguish it from the actual sample size $K_1$), while the sample sizes
for the other stages retain their constant values $K_2, \ldots, K_N$. The resulting total number of samples is then
denoted by $\tilde K := \tilde K_1 + K_2 + \ldots + K_N$, and the corresponding multi-samples by $\tilde\omega^{(1)}$ and $\tilde\omega$ instead of
$\omega^{(1)}$ and $\omega$, respectively.

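Lemma 3.6 (Multi-Regularity) Consider problem MRP$[\omega^{(1)}, \ldots, \omega^{(i-1)}, \tilde\omega^{(i)}, \omega^{(i+1)}, \ldots, \omega^{(N)}]$ for varying
sizes $\tilde K_i$ of the multi-sample $\tilde\omega^{(i)}$. For any $\bar z_i \in \mathbb{N}_0$,

$\Pr^{\tilde K}\{\tilde\omega \in \Delta^{\tilde K} \mid \mathrm{card}(\mathrm{Es}_i) = \bar z_i\}$ (3.2)

is either positive for all values of $\tilde K_i \ge \bar z_i + 1$, or zero for all values of $\tilde K_i \ge \bar z_i + 1$.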

Proof. The result is shown for the first stage i = 1 only, whence it follows for all stages for reasons of
symmetry.

Assume that the probability in (3.2) is positive for some $\tilde K_1 \ge \bar z_1 + 1$; then with some non-zero
probability there exists at least one sampled constraint which is not in the essential set $\mathrm{Es}_1$. Moreover, by
virtue of Assumption 2.4 all of the $\tilde K_1$ sampled constraints are i.i.d. and thus none of them can be any more
or less likely than another to become an essential constraint. So the probability that any specific $\tilde K_1 - 1$ out
of the $\tilde K_1$ sampled constraints contain all $\bar z_1$ essential constraints must be positive. Thus the probability of
having an essential set of cardinality $\bar z_1$ for $\tilde K_1 - 1$ samples is positive.

Following the above argument, the probability that any specific single sampled constraint out of the $\tilde K_1$
sampled constraints is not essential is non-zero. Since an additional $(\tilde K_1 + 1)$-th sample is again i.i.d. to the
previous $\tilde K_1$ samples, the probability that the $(\tilde K_1 + 1)$-th sampled constraint is not essential is positive. Thus
the probability of having an essential set of cardinality $\bar z_1$ for $\tilde K_1 + 1$ samples is also positive.

The claimed result now follows by induction on $\tilde K_1$: if the probability in (3.2) is positive for some
$\tilde K_1 \ge \bar z_1 + 1$, then it must be positive for all $\tilde K_1 \ge \bar z_1$; consequently, if it is zero for some $\tilde K_1 \ge \bar z_1 + 1$,
then it must be zero for all $\tilde K_1 \ge \bar z_1$. □



4 Dimensions of the Multi-Stage Problem 

The link between the sample sizes $K_1, \ldots, K_N$ and the corresponding violation probability of the
randomized solution of (2.3) depends decisively on the 'problem dimensions'. In the case of multi-stage RPs, the
notion of the 'problem dimension' is more intricate than for the single-stage case, where it is simply
determined by the maximum possible value for card(Es) and usually equals the dimension d of the search
space [11]. Not surprisingly, for multi-stage RPs the 'dimension' ceases to be a property of the problem as
a whole; instead each of the stages has its own associated 'dimension'.

4.1 The Support Dimension 

The 'dimension' of a specific stage $i \in \mathbb{N}_1^N$ in the problem, embodied by the new concept of a support
dimension (or s-dimension), is defined below.

Definition 4.1 (Support Dimension) (a) The support dimension (or s-dimension) of a stage $i \in \mathbb{N}_1^N$ in
(2.3) is the smallest integer $\zeta_i$ that satisfies

$\operatorname{ess\,sup}_{\omega \in \Delta^K} \mathrm{card}(\mathrm{Es}_i) \le \zeta_i$.

(b) Helly's dimension is the smallest integer $\zeta$ that satisfies

$\operatorname{ess\,sup}_{\omega \in \Delta^K} \mathrm{card}(\mathrm{Es}) \le \zeta$.

It should be pointed out that (finite) integers $\zeta$ and $\zeta_1, \ldots, \zeta_N$ matching Definition 4.1 always exist, so
that the concepts of an s-dimension and Helly's dimension are indeed well-defined. This follows from the
fact that there exists an upper bound of d on the total number of support constraints in a (feasible and convex)
d-dimensional optimization problem, e.g. [11, Thm. 2]. The latter result also provides immediate upper
bounds on the s-dimensions, namely $\zeta_i \le \zeta \le d$ for every $i \in \mathbb{N}_1^N$.

In general, $\zeta_i$ represents an upper bound for $\mathrm{card}(\mathrm{Es}_i)$, which is used for determining the sample size
$K_i$ of stage i in Sections 5 and 6. Yet in many practical problems the value of $\zeta_i$ may not be known exactly;
in this case it has to be replaced by an upper bound $\zeta_i'$ of the actual s-dimension $\zeta_i$. As the required sample
size $K_i$ grows with the upper bound $\zeta_i'$, the tightness of this bound has a decisive impact
on the solution quality (as suggested by Proposition 4.2 below). Helly's dimension provides a universal
upper bound that is often easy to compute, but in many cases it can be significantly improved by exploiting
structural properties of the constraints.

Proposition 4.2 (Probability Bound) Consider some stage $i \in \mathbb{N}_1^N$ of MRP$[\omega^{(1)}, \ldots, \omega^{(N)}]$, and let $\zeta_i'$
be an upper bound of its s-dimension $\zeta_i$ in the problem. Then the probability for any sample $\delta^{(i,\kappa_i)}$, for
$\kappa_i \in \mathbb{N}_1^{K_i}$, of generating an essential constraint of (2.3) is bounded almost surely by

$\Pr^K\{\omega \in \Delta^K \mid \kappa_i \in \mathrm{Es}\} \le \frac{\zeta_i'}{K_i}$. (4.1)

Proof. By virtue of Assumption 2.4 all samples in $\omega$ are independent, whence the event in (4.1) can
be measured by the K-th product measure of Pr, and also identically distributed, whence all constraints
$\kappa_i \in \mathbb{N}_1^{K_i}$ generated for a fixed i in (2.3b) are probabilistically identical. Thus none of them can be any
more or less likely than another to become an essential constraint.

The number of essential constraints $z_i := \mathrm{card}(\mathrm{Es}_i)$ is a random variable of unknown distribution.
According to Definition 4.1(a), $z_i \le \zeta_i$ almost surely, and by assumption $\zeta_i \le \zeta_i'$.

So consider any event in which $z_i = \bar z_i$, where $\bar z_i$ is a constant value for which this event has
non-zero probability. This last condition ensures that the conditional probability of a specific $\kappa_i \in \mathbb{N}_1^{K_i}$
generating an essential constraint, given that $z_i = \bar z_i$,

$\Pr^K_{z_i = \bar z_i}\{\omega \in \Delta^K \mid \kappa_i \in \mathrm{Es}\}$, (4.2)

is well defined according to [5, Cha. 1.4] or [24, Cha. 1.3]. For the notation of the conditional probability
measure in (4.2), an index is added to the measure Pr, indicating the conditional event '$z_i = \bar z_i$'.

Since by the first argument all $K_i$ sampled constraints of stage i have the same probability of becoming
an essential constraint, given that there are exactly $\bar z_i$ essential constraints this probability equals

$\Pr^K_{z_i = \bar z_i}\{\omega \in \Delta^K \mid \kappa_i \in \mathrm{Es}\} = \frac{\bar z_i}{K_i}$. (4.3)

The claim follows from (4.3) and the fact that for any possible event the bound $z_i = \bar z_i \le \zeta_i'$ holds. □
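
The symmetry argument behind (4.3) can be checked empirically on a toy problem. The sketch below is an illustration under stated assumptions, not part of the paper: it uses a single-stage scalar program (minimize x subject to x >= delta_k) with samples drawn uniformly from [0, 1], for which exactly one constraint is essential, so (4.3) predicts a probability of 1/K for any fixed sample. The function name `essential_frequency` is hypothetical.

```python
import random

def essential_frequency(K, trials, seed=0):
    """Estimate how often sample index 0 generates the essential constraint.

    Toy problem (an assumption for illustration): minimize x subject to
    x >= delta_k for k = 0, ..., K-1, with delta_k i.i.d. uniform on [0, 1].
    The unique essential constraint is the one with the largest delta_k.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        deltas = [rng.random() for _ in range(K)]
        if max(range(K), key=deltas.__getitem__) == 0:
            hits += 1
    return hits / trials

K = 5
freq = essential_frequency(K, trials=20000)
print(f"empirical: {freq:.3f}, predicted by (4.3): {1 / K:.3f}")
```

The empirical frequency should lie close to 1/K; the index 0 is arbitrary, since all samples are exchangeable.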

4.2 The Support Rank 

One possible property of a stage on which upper bounds on its s-dimension can be based, proposed in this paper, is the
support rank (or s-rank) of the stage. In many problems it can be much tighter than Helly's dimension, in
particular if the search space dimension d is large. Intuitively speaking, the s-rank of any stage $i \in \mathbb{N}_1^N$ in
(2.3) is the dimension d of the search space $\mathbb{R}^d$ less the maximal dimension of an unconstrained subspace,
which is the maximal subspace that cannot be constrained by the sampled constraints of stage i, no matter
the outcomes of the samples in $\omega^{(i)}$. In this respect, a very mild technical assumption is introduced that
corresponds to similar conditions imposed for single-stage RPs [10, Thm. 3.3] and [12, Def. 3.1].

Assumption 4.3 The sample sizes $K_1, \ldots, K_N$ observe the lower bounds

$K_i \ge \zeta_i + 1 \quad \forall\, i \in \mathbb{N}_1^N$. (4.4)

Before introducing the s-rank in a rigorous manner, three examples of constraint classes whose s-rank 
is easily bounded are given below. This equips the reader with the necessary intuition behind this concept, 
which can be used to determine the s-rank of a stage immediately for many practical problems. 

Example 4.4 For each of the following cases, a visual illustration is provided in Figure 4.1.

(a) Single Linear Constraint. Suppose some stage $i \in \mathbb{N}_1^N$ in (2.1b) takes the linear form

$f_i(x, \delta) = a^T x - b(\delta)$, (4.5)

where $a \in \mathbb{R}^d$ and $b : \Delta \to \mathbb{R}$ is any scalar that depends on the uncertainty. Then these constraints added
to problem (2.3) for the samples $\delta^{(i,1)}, \ldots, \delta^{(i,K_i)}$ are unable to constrain any direction in $\{a\}^\perp$, no matter
the outcome of the multi-sample $\omega$. Hence the s-rank of the stage in (4.5) equals $\alpha = 1$.

(b) Multiple Linear Constraints. For a generalization of case (a), suppose that some stage $i \in \mathbb{N}_1^N$ in
(2.1b) is given by

$f_i(x, \delta) = A(\delta)\, x - b(\delta)$, (4.6)

where $A : \Delta \to \mathbb{R}^{r \times d}$ and $b : \Delta \to \mathbb{R}^r$ represent a matrix and a vector that depend on the uncertainty $\delta$.
Moreover, suppose that the uncertainty enters the matrix $A(\delta)$ in such a way that the dimension of the linear
span of its rows $A_{j,\cdot}(\delta)$, where $j = 1, \ldots, r$, is known to satisfy

$\dim \operatorname{span}\{A_{j,\cdot}(\delta) \mid j \in \mathbb{N}_1^r,\ \delta \in \Delta\} \le \beta \le d$.

Then these constraints added to problem (2.3) for the samples $\delta^{(i,1)}, \ldots, \delta^{(i,K_i)}$ are unable to constrain any
direction in the orthogonal complement of the span, so the s-rank of the stage in (4.6) is at most $\beta$.

(c) Quadratic Constraint. For a nonlinear example, consider the case where some stage $i \in \mathbb{N}_1^N$ in
(2.1b) is given by

$f_i(x, \delta) = (x - x_c(\delta))^T Q\, (x - x_c(\delta)) - r(\delta)$, (4.7)

where $Q \in \mathbb{R}^{d \times d}$ is positive semi-definite with $\operatorname{rank} Q = \gamma \le d$, and $x_c : \Delta \to \mathbb{R}^d$, $r : \Delta \to \mathbb{R}_+$ represent
a vector and a scalar that depend on the uncertainty. Then these constraints added to problem (2.3) for the
samples $\delta^{(i,1)}, \ldots, \delta^{(i,K_i)}$ are unable to constrain any direction in the nullspace of the matrix $Q$, which has
dimension $d - \gamma$, and so the s-rank of the stage in (4.7) equals $\gamma$.
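
The rank computations behind cases (b) and (c) can be illustrated numerically. The following is a minimal sketch assuming NumPy is available; the function names and the example matrices are hypothetical. Note that for case (b) a rank computed from finitely many sampled matrices can only under-estimate the span dimension over all of $\Delta$, so a valid bound $\beta$ must still come from structural knowledge of how $\delta$ enters $A(\delta)$.

```python
import numpy as np

def span_dim_of_samples(A_samples):
    """Dimension of the span of all rows of the given sampled matrices A(delta)."""
    stacked = np.vstack(A_samples)            # stack the rows of every sampled matrix
    return int(np.linalg.matrix_rank(stacked))

def srank_quadratic(Q):
    """Case (c): the s-rank equals the rank of the positive semi-definite matrix Q."""
    return int(np.linalg.matrix_rank(Q))

d = 4
# Hypothetical case (b): every row of A(delta) lies in span{e1, e2}, so beta = 2.
rng = np.random.default_rng(0)
basis = np.eye(d)[:2]                          # two basis rows spanning the subspace
A_samples = [rng.normal(size=(3, 2)) @ basis for _ in range(10)]
beta = span_dim_of_samples(A_samples)

# Hypothetical case (c): Q = diag(1, 2, 0, 0) has rank gamma = 2.
Q = np.diag([1.0, 2.0, 0.0, 0.0])
gamma = srank_quadratic(Q)
```

In both hypothetical instances the s-rank bound is 2, compared with Helly's dimension bound of d = 4.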

After this intuitive example, the s-rank is now introduced in a rigorous manner for a fixed stage $i \in \mathbb{N}_1^N$
of (2.3). For every point $x \in X$ and every value $\delta \in \Delta$, denote the corresponding level set of $f_i(\cdot, \delta) : \mathbb{R}^d \to \mathbb{R}$
by

$F_i(x, \delta) := \{\xi \in \mathbb{R}^d \mid f_i(x + \xi, \delta) = f_i(x, \delta)\}$. (4.8)

Let $\mathcal{L}$ be the collection of all linear subspaces in $\mathbb{R}^d$. In order to be unconstrained, only those subspaces
should be selected that are contained in all level sets $F_i(x, \delta)$:

$\mathcal{L}_i := \bigcap_{\delta \in \Delta} \bigcap_{x \in X} \{L \in \mathcal{L} \mid L \subset F_i(x, \delta)\}$. (4.9)

More precisely, in (4.9) any Pr-null set could be removed from $\Delta$ that adversely influences the subsequent
definition of the support rank; however this is not made explicit here to avoid an unnecessary complication
of the notation.

Proposition 4.5 (Well-Definition of Unconstrained Subspace) The collection $\mathcal{L}_i$ contains a unique
maximal element $S_i$ in the set-inclusion sense, i.e. $S_i$ contains all other elements of $\mathcal{L}_i$ as subsets.

Proof. First, note that $\mathcal{L}_i$ is always non-empty, because for every $x \in X$ and every $\delta \in \Delta$ the level set
$F_i(x, \delta)$ includes the origin by its definition in (4.8), and therefore $\mathcal{L}_i$ contains (at least) the trivial subspace
{0}.

Second, introduce the partial order '$\preceq$' on $\mathcal{L}_i$ defined by set inclusion; i.e. for any two elements
$L_A, L_B \in \mathcal{L}_i$, $L_A \preceq L_B$ if and only if $L_A \subseteq L_B$. Since every chain in $\mathcal{L}_i$ has an upper bound (namely
$\mathbb{R}^d$), Zorn's Lemma (which is itself equivalent to the Axiom of Choice, see JTj p. 50]) implies that $\mathcal{L}_i$ has at
least one maximal element, in the set-inclusion sense.

Third, in order to prove that the maximal element is unique, suppose that $L_A, L_B$ are both maximal
elements of $\mathcal{L}_i$. It will be shown that $L_A \oplus L_B \in \mathcal{L}_i$, so that $L_A \ne L_B$ would contradict their maximality.
According to (4.9), it must be shown that their direct sum $L_A \oplus L_B \subset F_i(x, \delta)$ for any fixed values $x \in X$
and $\delta \in \Delta$. To see this, pick

$\xi \in L_A \oplus L_B \ \Rightarrow\ \xi = \xi_A + \xi_B$ for $\xi_A \in L_A$, $\xi_B \in L_B$,

and then apply (4.8) twice to obtain

$f_i(x + \xi_A + \xi_B, \delta) = f_i(x + \xi_A, \delta) = f_i(x, \delta)$,

because $\xi_A \in L_A$ and $\xi_B \in L_B$. □



Definition 4.6 (Unconstrained Subspace, Support Rank) (a) The unique maximal element $S_i \in \mathcal{L}_i$ of
Proposition 4.5 is called the unconstrained subspace of stage $i \in \mathbb{N}_1^N$. (b) The associated support rank (or
s-rank) $\zeta_i' \in \mathbb{N}_0^d$ is defined as d minus the subspace dimension of $S_i$,

$\zeta_i' := d - \dim S_i$.

Note that if $\mathcal{L}_i$ contains only the trivial subspace, then the support rank equals its upper
bound d; on the other hand, if $\mathcal{L}_i$ contains more elements than the trivial subspace, then the support rank
becomes strictly lower than d. The following theorem connects the s-rank to the s-dimension of a stage,
proving that the s-rank of a stage represents an upper bound to its s-dimension.

Theorem 4.7 (Rank Condition) Suppose that in (2.3) some stage $i \in \mathbb{N}_1^N$ has the s-rank $\zeta_i' \in \mathbb{N}_1^d$; then its
s-dimension is bounded by $\zeta_i \le \zeta_i'$.

Proof. Without loss of generality, the proof is again given for the first stage i = 1. Pick any random
multi-sample $\omega \in \Delta^K$ (less any $\Pr^K$-null set for which the s-rank condition of Definition 4.6 may not
hold).

By the assumption, there exists a (closed) linear subspace $S_1 \subset \mathbb{R}^d$ of dimension $d - \zeta_1'$ for which

$f_1(x + \xi, \cdot) = f_1(x, \cdot) \quad \forall\, x, (x + \xi) \in X,\ \xi \in S_1$.

Let $S_1^\perp$ denote the orthogonal complement of $S_1$; recall that it is also a (closed) linear subspace in $\mathbb{R}^d$ of
dimension $\zeta_1'$ and that every vector in $\mathbb{R}^d$ can be uniquely written as the orthogonal sum of vectors in $S_1$
and $S_1^\perp$, e.g. Bp- 135].

For the sake of a contradiction, suppose that i = 1 contributes more than $\zeta_1'$ essential constraints to the
resulting multi-stage RP (2.4). Denoting the samples that generate the essential constraints by

$\bar\omega^{(i)} := \{\delta^{(i,\kappa_i)} \mid \kappa_i \in \mathrm{Es}_i\} \quad \forall\, i = 1, \ldots, N$,

this means that $\mathrm{card}(\bar\omega^{(1)}) \ge \zeta_1' + 1$. It will be shown that this contradicts the optimality of the solution

$\bar x_0^\star := x^\star(\omega^{(1)}, \ldots, \omega^{(N)}) = x^\star(\bar\omega^{(1)}, \ldots, \bar\omega^{(N)})$.

Some additional notation is introduced to simplify the proof. Define

$\hat X := \bigcap_{i=2}^{N} \bigcap_{\kappa_i = 1}^{K_i} \{x \in X \mid f_i(x, \delta^{(i,\kappa_i)}) \le 0\}$

as the feasible set with respect to all constraints, except for those pertaining to stage i = 1; being the
intersection of convex sets, $\hat X$ is convex. Moreover, for any $\kappa_1 \in \mathrm{Es}_1$, define

$\bar x_{\kappa_1}^\star := x^\star(\bar\omega^{(1)} \setminus \{\delta^{(1,\kappa_1)}\}, \bar\omega^{(2)}, \ldots, \bar\omega^{(N)})$

as the new solutions when the essential constraint $\kappa_1$ is omitted from the reduced problem of (2.3). Recall
that an essential constraint of (2.4) is a support constraint of the reduced problem by Definition 3.2(b), so the
solution moves away from $\bar x_0^\star$ when a constraint $\kappa_1 \in \mathrm{Es}_1$ is omitted. Let the collection of all randomized
solutions be written as

$\tilde X := \{\bar x_{\kappa_1}^\star \mid \kappa_1 \in \mathrm{Es}_1\} \cup \{\bar x_0^\star\}$.

Observe that each $\bar x_{\kappa_1}^\star$, for $\kappa_1 \in \mathrm{Es}_1$, is feasible with respect to all constraints, except for the $\kappa_1$-th one,
which is necessarily violated according to Definition 3.1. In consequence, all elements in $\tilde X$ are pairwise
distinct.

Since $\mathbb{R}^d$ is the orthogonal direct sum of $S_1$ and $S_1^\perp$, for each point in $\tilde X$ there is a unique orthogonal
decomposition

$\bar x_{\kappa_1}^\star = v_{\kappa_1} + w_{\kappa_1}$, where $v_{\kappa_1} \in S_1$, $w_{\kappa_1} \in S_1^\perp$, $\forall\, \kappa_1 \in \mathrm{Es}_1 \cup \{0\}$.

Consider the set

$W := \{w_{\kappa_1} \mid \kappa_1 \in \mathrm{Es}_1 \cup \{0\}\}$

containing at least $\zeta_1' + 2$ points in the $\zeta_1'$-dimensional subspace $S_1^\perp$. Applying Radon's Theorem [26,
p. 151], $W$ can be split into two disjoint subsets $W_A$ and $W_B$ such that there exists a point $\hat w$ in the
intersection of their convex hulls:

$\hat w \in \mathrm{conv}(W_A) \cap \mathrm{conv}(W_B)$. (4.10)

When the indices in $\mathrm{Es}_1 \cup \{0\}$ are split correspondingly into $I_A$ and $I_B$, it can be observed that every point
$w_A \in \mathrm{conv}(W_A)$ satisfies the constraints

$f_1(w_A, \delta^{(1,\kappa_1)}) \le 0 \quad \forall\, \kappa_1 \in I_B$.

This is true because it holds for every element in $W_A$ and hence, by convexity, for every element in its
convex hull. For the same reason, every point $w_B \in \mathrm{conv}(W_B)$ satisfies the constraints

$f_1(w_B, \delta^{(1,\kappa_1)}) \le 0 \quad \forall\, \kappa_1 \in I_A$.

Hence from (4.10) it follows that

$f_1(\hat w, \delta^{(1,\kappa_1)}) \le 0 \quad \forall\, \kappa_1 \in \mathrm{Es}_1$. (4.11)

According to (4.10), $\hat w$ can be expressed as a convex combination of elements in $W_A$ or $W_B$. Splitting
the points in $\tilde X$ into $\tilde X_A$ and $\tilde X_B$ correspondingly and applying the same convex combination yields some

$\hat x \in \mathrm{conv}(\tilde X_A) \cap \mathrm{conv}(\tilde X_B)$, (4.12)

and in consequence also some $\hat v \in S_1$.

To establish the contradiction two things remain to be verified: first that $\hat x$ is feasible with respect to
all constraints, and second that it has a lower cost (or a better tie-break value) than $\bar x_0^\star$. For the first, $\hat x \in \hat X$
because all points of $\tilde X$ are in the convex set $\hat X$ and $\hat x$ is in their convex hull. Moreover,

$f_1(\hat x, \delta^{(1,\kappa_1)}) = f_1(\hat w, \delta^{(1,\kappa_1)}) \le 0 \quad \forall\, \kappa_1 \in \mathrm{Es}_1$

because of (4.11). For the second, take the set from $\tilde X_A$ and $\tilde X_B$ which does not contain $\bar x_0^\star$; without loss
of generality, say this is $\tilde X_A$. By construction all elements of $\tilde X_A$ have a strictly lower target function value
(or at least a better tie-break value) than $\bar x_0^\star$. By linearity this also holds for points in $\mathrm{conv}(\tilde X_A)$, and thus
also for $\hat x \in \mathrm{conv}(\tilde X_A)$ by (4.12). □



5 Randomized Approximation of MSP Solution 

In this section, the central result regarding the approximation of the solution to the MSP by the solution of
the multi-stage RP is stated and proven. The main theorem in the first part provides an implicit link between
the sample sizes $K_1, \ldots, K_N$ and the probabilities of the events that $V_1 \le \varepsilon_1, \ldots, V_N \le \varepsilon_N$. Based
on this result, an explicit relationship is derived in the second part.

5.1 The Sampling Theorem 

This entire section is concerned with the proof of the following key result. 

Theorem 5.1 (Sampling Theorem) Consider problem (2.3) under all of the previous assumptions. For
every stage $i = 1, \ldots, N$ it holds that

$\Pr^K[V_i(\omega^{(1)}, \ldots, \omega^{(N)}) > \varepsilon_i] \le \Phi(\zeta_i' - 1; K_i, \varepsilon_i)$, (5.1)

where $\zeta_i'$ denotes the s-rank of stage i and $\Phi(\cdot\,; \cdot, \cdot)$ the cumulative probability distribution of a binomial
random variable, as defined in Appendix A.

Without loss of generality, it again suffices to prove the result for stage i = 1. As opposed to existing
results [10, Thm. 3.3] and [12, Thm. 2.4], the proof has to account for the simultaneous presence of the
sampled constraints of the other stages $i = 2, \ldots, N$, as well as the possibility of degeneracy and
non-multi-regularity. The main idea of the proof is to obtain an upper bound on the distribution of the random violation
probability $V_1(\omega^{(1)}, \ldots, \omega^{(N)})$ in (5.1); it is divided into four main parts in order to provide a better overview
of its fundamental structure.

Preliminaries

As in Lemma 3.6, the number of sampled constraints for stage i = 1 shall be varied in a thought experiment,
while the other sample sizes $K_2, \ldots, K_N$ maintain their constant values. The variable sample size $\tilde K_1$
is used to distinguish this thought experiment from the actual sample size $K_1$, and the variable multi-sample
$\tilde\omega^{(1)}$ to distinguish it from the actual multi-sample $\omega^{(1)}$. The total number of samples is then denoted by
$\tilde K := \tilde K_1 + K_2 + \ldots + K_N$ and the corresponding multi-sample by $\tilde\omega$.

The lack of multi-regularity is accounted for by considering only those cases of $z_1 := \mathrm{card}(\mathrm{Es}_1)$ that
have a non-zero probability for any $\tilde K_1 \ge \zeta_1 + 1$. Let $Z_1 \subset \mathbb{N}_0^{\zeta_1}$ be the set of cardinalities of $\mathrm{Es}_1$ which are
associated with a non-zero probability measure,

$Z_1 := \{\bar z_1 \in \mathbb{N}_0^{\zeta_1} \mid \Pr^{\tilde K}[\mathrm{card}(\mathrm{Es}_1) = \bar z_1] > 0\}$. (5.2)

According to Lemma 3.6, $Z_1$ is indeed well-defined as it is independent of any choice of the sample size
$\tilde K_1$ which is admissible by Assumption 4.3. Hence it makes sense to consider a fixed value $\bar z_1 \in Z_1$ from
here on, while keeping the sample size $\tilde K_1 \ge \bar z_1 + 1$ as a variable.

For any such $\tilde K_1$, let $E \subset \mathbb{N}_1^{\tilde K_1}$ be an arbitrary subset of cardinality $\bar z_1$ that contains the indices
of $\bar z_1$ specific choices from the $\tilde K_1$ sampled constraints. Define $B_E$ as the event where $E$ includes exactly
all of the $\bar z_1$ essential constraints contributed to (2.3) by stage i = 1, i.e. where $E = \mathrm{Es}_1$. The conditional
probability of the event $B_E$, given that $z_1 = \bar z_1$,

$\Pr^{\tilde K}_{z_1 = \bar z_1}[B_E] := \Pr^{\tilde K}_{z_1 = \bar z_1}\{\tilde\omega \in \Delta^{\tilde K} \mid B_E\}$, (5.3)

is now computed in two ways and the results are then equated. Note that both computations leave the
hypothetical sample size $\tilde K_1$ open, and hence they work for any selection of the sample size within its
allowed range of $\mathbb{N}_{\bar z_1 + 1}^{\infty}$.

Computation of the conditional probability

The first way of computing the conditional probability in (5.3) is an immediate consequence of the fact
that each subset of $\mathbb{N}_1^{\tilde K_1}$ with cardinality $\bar z_1$ is equally likely to constitute the minimal essential set $\mathrm{Es}_1$.
This is true because all $\tilde K_1$ sampled constraints in (2.3b) for stage i = 1 are generated by i.i.d. random
variables (Assumption 2.4 and Section 3.2) and are thus identical from a probabilistic point of view. Given
that $z_1 = \bar z_1$, there exist a total of $\tilde K_1$ choose $\bar z_1$ possible choices for $\mathrm{Es}_1$, so it follows that

$\Pr^{\tilde K}_{z_1 = \bar z_1}[B_E] = \binom{\tilde K_1}{\bar z_1}^{-1}$. (5.4)

For the second way of computing the conditional probability in (5.3), observe that, given that $\mathrm{card}(\mathrm{Es}_1) = \bar z_1$,
the event $B_E$ occurs if and only if all of the $\tilde K_1 - \bar z_1$ constraints in $E^c := \mathbb{N}_1^{\tilde K_1} \setminus E$ are not in the minimal
essential set $\mathrm{Es}_1$. Given that $\mathrm{card}(\mathrm{Es}_1) = \bar z_1$, the probability of $B_E$ is therefore equal to the probability that
all constraints in $E^c$ are not essential constraints. To compute this probability, one may start by considering
the specific case in which (2.3) includes only the constraints in $E$, i.e.

MRP$[\{\delta^{(1,\kappa_1)} \mid \kappa_1 \in E\}, \omega^{(2)}, \ldots, \omega^{(N)}]$.

Given that $\mathrm{card}(\mathrm{Es}_1) = \bar z_1$, the conditional probability that $\mathrm{Es}_1 = E$ is one. Now extract another
random sample $\delta^{(1,\bar z_1 + 1)}$ for stage i = 1 and let $v$ be the probability that this constraint becomes essential.
Note that $v$ is itself a random variable since it depends, in particular, on the outcome of the random samples
$\delta^{(1,\kappa_1)}$ with $\kappa_1 \in E$, as well as the multi-samples of the other stages $\omega^{(2)}, \ldots, \omega^{(N)}$ (and potentially their
corresponding tie-break values due to Definition 3.4).

The event that the newly sampled constraint does not become essential has the probability $(1 - v)$, in
which case the essential set remains $\mathrm{Es}_1 = E$. In this case, the previous procedure can be repeated for
another sample $\delta^{(1,\bar z_1 + 2)}$, for which all of the above holds in the same way. Therefore, in terms of $v$, the
probability that none of the additionally extracted $\tilde K_1 - \bar z_1$ sampled constraints become essential is equal
to $(1 - v)^{\tilde K_1 - \bar z_1}$.

Even though the probability distribution of $v$ is unknown, some distribution function $F_{\bar z_1} : [0,1] \to [0,1]$
can be introduced as a placeholder. It makes it possible to express the conditional probability that, after extracting $\tilde K_1$
samples, $E$ contains exactly all constraints in $\mathrm{Es}_1$, given that $z_1 = \bar z_1$, as

$\Pr^{\tilde K}_{z_1 = \bar z_1}[B_E] = \int_0^1 (1 - v)^{\tilde K_1 - \bar z_1} \, dF_{\bar z_1}(v)$. (5.5)



Upper bound on the distribution

Since (5.4) and (5.5) express the same quantity, computed from different perspectives, they can be equated.
This yields a set of integral equations

$\int_0^1 (1 - v)^{\tilde K_1 - \bar z_1} \, dF_{\bar z_1}(v) = \binom{\tilde K_1}{\bar z_1}^{-1} \quad \forall\, \tilde K_1 \in \mathbb{N}_{\bar z_1}^{\infty}$, (5.6)

which must be satisfied by the unknown probability distribution function $F_{\bar z_1}$. Here the missing condition
for $\tilde K_1 = \bar z_1$ follows trivially from the property that any probability distribution function has integral one.

Solving (5.6) for $F_{\bar z_1}$ is known in the literature as a Hausdorff Moment Problem [25]. If a solution can
be found, it is necessarily unique, as shown by [24, Cor. II.12.1] or [5, Thm. 30.1]. A comparison with
the Euler Integral of the first kind [1, 6.2.1 and 6.2.2] immediately reveals that the following choice for the
probability density function solves (5.6), from which the desired (unique) distribution function can hence
be deduced:

$\frac{dF_{\bar z_1}(v)}{dv} = \bar z_1 v^{\bar z_1 - 1} \ \Rightarrow\ F_{\bar z_1}(v) = v^{\bar z_1}$. (5.7)
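
That the density in (5.7) solves the moment equations (5.6) can be verified in exact rational arithmetic. The sketch below is an illustrative check for a few hypothetical values of $\bar z_1$ (`z`) and $\tilde K_1$ (`K`); it expands $(1 - v)^{K - z}$ by the binomial theorem and integrates term by term. The function name `moment_lhs` is not from the paper.

```python
from fractions import Fraction
from math import comb

def moment_lhs(K, z):
    """Exact value of the integral over [0, 1] of (1 - v)^(K - z) * z * v^(z - 1) dv.

    Uses (1 - v)^n = sum_j C(n, j) (-1)^j v^j and integrates each monomial exactly.
    """
    n = K - z
    return z * sum(Fraction((-1) ** j * comb(n, j), z + j) for j in range(n + 1))

# Compare with the right-hand side of (5.6), i.e. 1 / C(K, z), for several (z, K).
checks = [(z, K, moment_lhs(K, z) == Fraction(1, comb(K, z)))
          for z in range(1, 5) for K in range(z, z + 8)]
all_ok = all(ok for _, _, ok in checks)
```

A finite check of this kind does not replace the uniqueness argument cited above; it only confirms the algebra of the Beta integral for the tested values.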



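Completing the proof

Now consider again the actual sample size, which satisfies $K_1 \ge \zeta_1 + 1 \ge \bar z_1 + 1$ (by Assumption 4.3),
and the event $B_{E,\varepsilon_1}$ in which two things hold: (a) as in $B_E$, a specific subset $E \subset \mathbb{N}_1^{K_1}$ of $\mathrm{card}(E) = \bar z_1$
constitutes the minimal essential set $\mathrm{Es}_1$, and (b) the probability that a newly extracted constraint becomes
an essential constraint of the augmented problem is greater than $\varepsilon_1$.

With the distribution function $F_{\bar z_1}$ from (5.7), the conditional probability of $B_{E,\varepsilon_1}$, given that $z_1 = \bar z_1$,
becomes

$\Pr^K_{z_1 = \bar z_1}[B_{E,\varepsilon_1}] = \int_{\varepsilon_1}^1 (1 - v)^{K_1 - \bar z_1} \, dF_{\bar z_1}(v)$
$\quad = \bar z_1 \int_{\varepsilon_1}^1 v^{\bar z_1 - 1} (1 - v)^{K_1 - \bar z_1} \, dv$
$\quad = \bar z_1 \, B(1 - \varepsilon_1; K_1 - \bar z_1 + 1, \bar z_1)$
$\quad = \binom{K_1}{\bar z_1}^{-1} \Phi(\bar z_1 - 1; K_1, \varepsilon_1)$, (5.8)

where identity (A.6) from Appendix A has been used to transform the incomplete beta function $B(\cdot\,; \cdot, \cdot)$
obtained by (A.4) into the binomial distribution function $\Phi(\cdot\,; \cdot, \cdot)$.

In order to bring the result in (5.8) closer to the claim, consider the new event $B_{\bar z_1,\varepsilon_1}$ in which (a) $\mathrm{Es}_1$ is
now an arbitrary subset $E \subset \mathbb{N}_1^{K_1}$ of $\mathrm{card}(E) = \bar z_1$, and (b) the probability that a newly extracted constraint
becomes an essential constraint of the augmented problem is greater than $\varepsilon_1$. By the same line of argument
as above and using (5.8),

$\Pr^K_{z_1 = \bar z_1}[B_{\bar z_1,\varepsilon_1}] = \binom{K_1}{\bar z_1} \Pr^K_{z_1 = \bar z_1}[B_{E,\varepsilon_1}] = \Phi(\bar z_1 - 1; K_1, \varepsilon_1)$. (5.9)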

Finally, the Sampling Lemma 3.5 has shown that the fact that a newly extracted constraint violates the
current solution implies that it must be an essential constraint of the augmented problem. Therefore the
probability of violating the current solution is always smaller than or equal to the probability of becoming
an essential constraint of the augmented problem, i.e.

$\Pr^K_{z_1 = \bar z_1}[V_1(\omega^{(1)}, \ldots, \omega^{(N)}) > \varepsilon_1] \le \Pr^K_{z_1 = \bar z_1}[B_{\bar z_1,\varepsilon_1}]$.

Moreover, for each $\bar z_1 \in Z_1$, as defined in (5.2), it holds that

$\Pr^K_{z_1 = \bar z_1}[B_{\bar z_1,\varepsilon_1}] = \Phi(\bar z_1 - 1; K_1, \varepsilon_1) \le \Phi(\zeta_1' - 1; K_1, \varepsilon_1)$,

from the monotonicity property of the binomial distribution function that follows from its definition in
(A.1). Given that the events $z_1 = \bar z_1$, for $\bar z_1 \in Z_1$, are collectively exhaustive and mutually exclusive,

$\Pr^K[V_1(\omega) > \varepsilon_1] = \sum_{\bar z_1 \in Z_1} \Pr^K_{z_1 = \bar z_1}\{\omega \in \Delta^K \mid V_1(\omega) > \varepsilon_1\} \cdot \Pr^K[z_1 = \bar z_1]$
$\quad \le \sum_{\bar z_1 \in Z_1} \Pr^K_{z_1 = \bar z_1}\{\omega \in \Delta^K \mid B_{\bar z_1,\varepsilon_1}\} \cdot \Pr^K[z_1 = \bar z_1]$ (5.10)
$\quad \le \Phi(\zeta_1' - 1; K_1, \varepsilon_1)$.

This completes the proof of Theorem 5.1.

5.2 Explicit Bounds on the Sample Sizes 

Formula (5.1) in Theorem 5.1 ensures a confidence level of $1 - \Phi(\zeta_i' - 1; K_i, \varepsilon_i)$ that the violation probability
$V_i(\omega^{(1)}, \ldots, \omega^{(N)}) \le \varepsilon_i$, based on the sample size $K_i$ for stage i. However, in practical applications a given
confidence level $(1 - \theta_i) \in (0, 1)$ is usually imposed, for which an appropriate sample size $K_i$ needs to be
identified.

The most accurate way of finding this sample size is by observing that $\Phi(\zeta_i' - 1; K_i, \varepsilon_i)$ is a
monotonically decreasing function in $K_i$ and applying a numerical procedure (e.g. based on bisection) for computing
the smallest sample size that ensures $\Phi(\zeta_i' - 1; K_i, \varepsilon_i) \le \theta_i$. The resulting $K_i$ shall be referred to as the
implicit bound on the sample size.

For a qualitative analysis of the behavior of this implicit bound for changing values of $\varepsilon_i$ and $\theta_i$ (and
also for initialization of a bisection-based algorithm), it is useful to derive an explicit bound on the sample
size $K_i$. Since formula (5.1) cannot be readily inverted, the binomial distribution function must first be
controlled by some upper bound, which is then inverted. The details of this procedure are well-documented
in the literature and therefore omitted.
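
The bisection procedure mentioned above might be sketched as follows. This is a minimal illustration, not the authors' implementation: the binomial distribution function is evaluated by direct summation (adequate for moderate s-ranks; a library routine would be preferable for large values), and the names `binom_cdf` and `implicit_sample_size` as well as the argument `rank` (standing for the s-rank) are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """Phi(k; n, p): probability of at most k successes in n Bernoulli(p) trials."""
    return sum(comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(0, min(k, n) + 1))

def implicit_sample_size(rank, eps, theta):
    """Smallest K >= rank with Phi(rank - 1; K, eps) <= theta, found by bisection."""
    lo = rank
    if binom_cdf(rank - 1, lo, eps) <= theta:
        return lo
    hi = 2 * lo
    while binom_cdf(rank - 1, hi, eps) > theta:    # grow the bracket until the bound is met
        lo, hi = hi, 2 * hi
    while hi - lo > 1:                             # invariant: cdf(lo) > theta >= cdf(hi)
        mid = (lo + hi) // 2
        if binom_cdf(rank - 1, mid, eps) <= theta:
            hi = mid
        else:
            lo = mid
    return hi

# Hypothetical stage with s-rank 1, violation level 0.1 and confidence parameter 0.01.
K_implicit = implicit_sample_size(rank=1, eps=0.1, theta=0.01)
```

For an s-rank of one, the condition reduces to $(1 - \varepsilon_i)^{K_i} \le \theta_i$, which gives a quick sanity check of the returned value.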

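A common approach using a Chernoff bound [14], as shown in [9, Rem. 2.3] and [10, Sec. 5], provides
a simple explicit formula for $K_i$:

$K_i \ge \frac{2}{\varepsilon_i} \left( \log\frac{1}{\theta_i} + \zeta_i' - 1 \right)$, (5.11)

where log(·) denotes the natural logarithm. As shown in [2, Cor. 1], this bound can be further improved to
a more complicated formula for $K_i$:

$K_i \ge \frac{1}{\varepsilon_i} \left( \log\frac{1}{\theta_i} + \sqrt{2 (\zeta_i' - 1) \log\frac{1}{\theta_i}} + \zeta_i' - 1 \right)$. (5.12)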
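
The two explicit bounds can be compared numerically. The sketch below is an illustration under assumptions: the Chernoff-type bound is taken in the form $K_i \ge (2/\varepsilon_i)(\log(1/\theta_i) + \zeta_i' - 1)$ that is standard in the cited literature, the improved bound follows the structure of (5.12), and all function names and numerical values are hypothetical.

```python
from math import ceil, comb, log, sqrt

def binom_cdf(k, n, p):
    """Phi(k; n, p): probability of at most k successes in n Bernoulli(p) trials."""
    return sum(comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(0, min(k, n) + 1))

def chernoff_sample_size(rank, eps, theta):
    """Chernoff-type explicit bound (assumed form): (2/eps) * (log(1/theta) + rank - 1)."""
    return ceil(2.0 / eps * (log(1.0 / theta) + rank - 1))

def improved_sample_size(rank, eps, theta):
    """Improved explicit bound in the form of (5.12)."""
    L = log(1.0 / theta)
    return ceil((L + sqrt(2.0 * (rank - 1) * L) + rank - 1) / eps)

rank, eps, theta = 5, 0.1, 0.01       # hypothetical values for illustration
K_chernoff = chernoff_sample_size(rank, eps, theta)
K_improved = improved_sample_size(rank, eps, theta)
# Both explicit sample sizes should satisfy the implicit condition of Theorem 5.1.
ok = all(binom_cdf(rank - 1, K, eps) <= theta for K in (K_chernoff, K_improved))
```

Either value can serve as the upper end of the bracket when searching for the implicit bound by bisection.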

6 The Sampling-and-Discarding Approach

The sampling-and-discarding approach has been examined in the context of single-stage RPs [10, 13]; it
can also be extended to multi-stage RPs, as shown in this section. The underlying goal is to improve the
quality of the randomized solution, i.e. its objective value, while keeping the probability of violating the
nominal chance constraints under control. To this end, the sample sizes $K_i$ are deliberately increased above
the bounds derived in Section 5, in exchange for allowing a certain number $R_i$ of sampled constraints to be
discarded ex-post, that is after the outcomes of the samples have been observed.

Appropriate discarding procedures are introduced in the first part of this section. Then the main result
of this section is stated, providing an implicit formula for selecting the right pairs of sample size $K_i$ and
number of discarded constraints $R_i$. These pairs may again differ between stages; in particular, a non-removal
strategy may still be followed for some of the stages i (by putting $R_i = 0$). Finally, explicit bounds for the
choice of pairs $K_i$ and $R_i$ are indicated, in analogy to Section 5.

6.1 Constraint Discarding Procedure 

For each stage of the problem, the discarding procedure is performed by an arbitrary, pre-defined (sample)
removal algorithm.

Definition 6.1 (Removal Algorithm) For each stage $i = 1, \ldots, N$ of (2.3), the (sample) removal algorithm
is a deterministic function of the multi-samples of all stages, $\omega \in \Omega$. It returns a subset
of samples $\hat\omega^{(i)} \subset \omega^{(i)}$, in which $R_i$ out of $K_i$ samples have been removed from $\omega^{(i)}$, i.e. $\mathrm{card}(\hat\omega^{(i)}) = K_i - R_i$.

With a view to improving the quality of the solution, the algorithm should of course aim at lowering
the objective function value of MRP$[\hat\omega^{(1)}, \ldots, \hat\omega^{(N)}]$ as much as possible. Various possibilities for removal
algorithms are described in detail by [10, Sec. 5.1], and further references are provided by [13, Sec. 2].
Brief descriptions of some important removal algorithms are given below.

Example 6.2 (a) Optimal Constraint Removal. The best improvement of the objective function value is
achieved by solving the reduced problem for all possible combinations of discarding $R_i$ out of $K_i$ constraints.
However, this generally leads to a combinatorial complexity of the optimal constraint removal
algorithm, which becomes computationally intractable for larger values of $R_i$, in particular when constraints
of multiple stages have to be discarded.

(b) Greedy Constraint Removal. Starting by solving the problem for all $K_i$ constraints, a greedy constraint
removal algorithm performs $R_i$ consecutive steps, where in each step one single constraint is removed
according to the optimal procedure of (a). Different stages can be handled either consecutively or
optimally. For most practical problems this algorithm can be expected to work almost as well as (a), while
posing a much lower computational burden.

(c) Marginal Constraint Removal. Similar to the algorithm in (b), the constraints are removed consecutively,
however based on the largest marginal cost improvement (given by the corresponding Lagrange
multiplier [8, Ch. 5]) in each step, instead of the largest total cost improvement. The marginal constraint
removal algorithm can be designed to remove one constraint of each stage in a single step of the procedure,
or to proceed consecutively among the stages.
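
As a rough illustration of the greedy strategy in (b), the following sketch applies it to a hypothetical one-dimensional stage: the sampled constraints require an interval to contain every sample, and the objective is the interval width. The function name `greedy_removal` and the toy data are assumptions made for this illustration only; they are not taken from [10] or [13].

```python
def greedy_removal(samples, num_discard):
    """Greedily discard `num_discard` samples so that the smallest interval
    containing the remaining samples becomes as short as possible.

    In this toy problem only the current minimum and maximum samples are
    active constraints, so each greedy step compares just these two removals.
    """
    kept = sorted(samples)
    removed = []
    for _ in range(num_discard):
        if len(kept) <= 2:
            break  # nothing sensible left to discard
        gain_low = kept[1] - kept[0]      # width saved by dropping the minimum
        gain_high = kept[-1] - kept[-2]   # width saved by dropping the maximum
        if gain_low >= gain_high:
            removed.append(kept.pop(0))
        else:
            removed.append(kept.pop())
    return kept, removed


samples = [0.1, -2.5, 0.7, 3.9, -0.4, 1.2, -0.9]
kept, removed = greedy_removal(samples, num_discard=2)
width = kept[-1] - kept[0]
print("discarded:", removed, "remaining width:", width)
```

In this toy run every discarded sample lies outside the final interval, so the discarded constraints are violated by the reduced solution, which is the situation required by the existing single-stage theory.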

Existing theory for single-stage RPs [10, Sec. 4.1.1] and [13, Ass. 2.2] assumes that all of the removed
constraints are violated by the reduced randomized solution. While this assumption is also sufficient for
the purposes of multi-stage RPs, it may turn out to be too restrictive in some problem instances. In fact,
due to the interplay of multiple stages, it may be impossible to find $R_i$ constraints that are violated by the
randomized solution. This situation may also occur in the case of a single stage in the presence of a
deterministic constraint set $\mathbb{X}$. For this case the alternative Assumption 6.3(b) of the monotonicity property
is offered, which is explained below.

Assumption 6.3 (Stage with Discarded Constraints) Every stage $i \in \mathbb{N}_1^N$ of (2.3) with $R_i > 0$ satisfies
at least one of the following two conditions:

(a) For almost every $\omega \in \Omega$, each of the constraints discarded by the removal algorithm
$\mathcal{A}_i^{(K_i, R_i)}(\omega)$ is violated by the reduced solution:

$$ f_i\bigl(x^\star(\hat{\omega}^{(1)}, \dots, \hat{\omega}^{(N)}),\, \delta^{(i,\kappa_i)}\bigr) > 0 \qquad \forall\, \delta^{(i,\kappa_i)} \in \omega^{(i)} \setminus \hat{\omega}^{(i)} . $$

(b) The stage satisfies the monotonicity property of Definition 6.4 below.



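Definition 6.4 (Monotonicity Property) A stage $i \in \mathbb{N}_1^N$ of (2.3) enjoys the monotonicity property if for
every $K_i \in \mathbb{N}$ and almost every $\omega^{(i)} \in \Delta^{K_i}$ the following condition holds: each point in the sampled
feasible set of stage $i$,

$$ \mathbb{X}_i(\omega^{(i)}) := \bigl\{ \xi \in \bar{\mathbb{R}}^d \;\big|\; f_i(\xi, \delta^{(i,\kappa_i)}) \le 0 \;\; \forall\, \kappa_i \in \mathbb{N}_1^{K_i} \bigr\} , \tag{6.1} $$

where $\bar{\mathbb{R}} := \mathbb{R} \cup \{\pm\infty\}$, is violated by a new sampled constraint only if the cost-minimal point in
$\mathbb{X}_i(\omega^{(i)})$,

$$ x_i^\star(\omega^{(i)}) := \arg\min \bigl\{ c^{\mathsf T} \xi \;\big|\; \xi \in \mathbb{X}_i(\omega^{(i)}) \bigr\} , \tag{6.2} $$

is also violated. More precisely, for each $\xi \in \mathbb{X}_i(\omega^{(i)})$ and almost every $\delta \in \Delta$,

$$ f_i(\xi, \delta) > 0 \quad \Longrightarrow \quad f_i\bigl(x_i^\star(\omega^{(i)}), \delta\bigr) > 0 . \tag{6.3} $$

At first glance, Definition 6.4 appears to be abstract, yet it is easy to check for most practical problems
without involving any calculations. The following example provides the necessary intuition.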

Example 6.5 (Monotonic and Non-Monotonic Stages) Consider the MRP of (2.3) in $d = 2$ dimensions,
where X = [—10, 10] 2 C R 2 , the target function vector is c = [ 1 ] T , and the number of stages is N = 2. 
For testing the monotonicity of each stage $i = 1, 2$, problem (2.3) is considered as a single-stage RP, i.e. the
compact set $\mathbb{X}$ and the respective other stage are neglected.

(a) Monotonic Stage. Suppose the first stage i = 1 is of the linear form 



<0 V«i = l,...,Jfi , 



where G {-1, 1} and G hM]- Observe that for any number of samples K± G N and any 

sample outcomes, the additional sample $(\delta_1, \delta_2)$ either cuts off no point from the feasible set of stage $i = 1$,
$\mathbb{X}_1(\omega^{(1)})$, or the cut-off set includes the cost-minimal point $x_1^\star(\{\delta^{(1,1)}, \dots, \delta^{(1,K_1)}\})$. Therefore stage $i = 1$
is monotonic in the sense of Definition 6.4. This argument is illustrated in Figure 6.1(a).
(b) Non-Monotonic Stage. Suppose the second stage i = 2 is of the linear form 



4 2 ' K2) 1 



<5f' K2) <0 V K2 -1,...,^ 2 



where $\delta_3^{(2,\kappa_2)} \in [-1, 1]$ and $\delta_4^{(2,\kappa_2)} \in [-1, 1]$. Observe that for any $K_2 \in \mathbb{N}$ it is possible for a new sample
$(\delta_3, \delta_4)$ to cut off some previously feasible point from $\mathbb{X}_2(\omega^{(2)})$, without also removing the cost-minimal
point $x_2^\star(\{\delta^{(2,1)}, \dots, \delta^{(2,K_2)}\})$. A possible configuration of this type is shown in Figure 6.1(b). Therefore
stage $i = 2$ is not monotonic in the sense of Definition 6.4.

Figure 6.1: Illustration of Example 6.5, with panels (a) Monotonic Stage and (b) Non-Monotonic Stage.
Non-bold constraints are generated by the multi-sample $\omega^{(i)} \in \Delta^{K_i}$ of stage $i = 1, 2$; bold constraints
result from the actual uncertainty $\delta \in \Delta$. Panel (a) depicts a stage for which it is not possible for a new
sample to cut off a feasible point without also removing the optimum; panel (b) depicts a case where a
feasible point is indeed cut off without removing the optimum.

The usefulness of the monotonicity property for RPs is revealed by the following result, whose proof is
an immediate consequence of Definition 6.4 and therefore omitted.

Lemma 6.6 If a stage $i \in \mathbb{N}_1^N$ of (2.3) enjoys the monotonicity property, then for every $K_i \in \mathbb{N}$ and almost
every $\omega^{(i)} \in \Delta^{K_i}$:

$$ \mathrm{Pr}\bigl[f_i(\xi, \delta) > 0\bigr] \le \mathrm{Pr}\bigl[f_i\bigl(x_i^\star(\omega^{(i)}), \delta\bigr) > 0\bigr] \qquad \forall\, \xi \in \mathbb{X}_i(\omega^{(i)}) . \tag{6.4} $$

In other words, with probability one every point $\xi$ in its sampled feasible set $\mathbb{X}_i(\omega^{(i)})$ has a violation
probability less than or equal to that of the cost-minimal point $x_i^\star(\omega^{(i)})$.

6.2 The Discarding Theorem 

For the sampling-and-discarding approach, the following result holds for the case of multi-stage RPs.

Theorem 6.7 (Discarding Theorem) Consider problem (2.3) together with some ex-post discarding algorithms
$\mathcal{A}_i^{(K_i, R_i)}$ returning the reduced multi-samples $\hat{\omega}^{(i)}$ for the stages $i = 1, \dots, N$. Then it holds that

$$ \mathrm{Pr}^{K}\bigl[ V_i(\hat{\omega}^{(1)}, \dots, \hat{\omega}^{(N)}) > \varepsilon_i \bigr] \le \binom{R_i + \rho_i - 1}{R_i}\, \Phi(R_i + \rho_i - 1;\, K_i, \varepsilon_i) , \tag{6.5} $$

where $\rho_i$ denotes the support rank of stage $i$ and $\Phi(\cdot\,;\cdot,\cdot)$ the binomial distribution function as defined in
Appendix A.

Proof. Here the multi-stage RP case is reduced to the single-stage RP case, for which two detailed
proofs are available in [10, Sec. 4.1.1] and [13, Sec. 5.1].

In particular, suppose that Assumption 6.3(a) holds. The proof in [13, Sec. 5.1] works analogously for
an arbitrary stage $i \in \mathbb{N}_1^N$, given that an upper bound of the violation distribution is readily available from
the proof of Theorem 5.1.

Alternatively, suppose that Assumption 6.3(b) holds. In this case the proof in [13, Sec. 5.1] can be
applied directly to the single-stage problem that arises from (2.3) when all other stages and $\mathbb{X}$ are omitted.
In consequence, (6.5) holds for the cost-minimal point of this single-stage RP. But given that the stage is
monotonic, by Lemma 6.6 it also holds for any feasible point, in particular for the optimal point of (2.3). □

An excellent account of the merits of the sampling-and-discarding approach is provided in [13], and
therefore it need not be restated here. It should be emphasized, however, that the randomized solution
converges to the true solution of the MSP as the number of discarded constraints increases, provided that the
constraints are removed by the optimal algorithm in Example 6.2(a).
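
To make the implicit selection of pairs $(K_i, R_i)$ more concrete, the following sketch evaluates the right-hand side of (6.5) for one stage and searches for the smallest sample size that keeps it below a confidence level $\theta_i$. The function names are hypothetical, and the bisection relies on the assumption that the bound is non-increasing in $K_i$. The floating-point evaluation of the binomial sum is only intended for moderate sample sizes.

```python
from math import comb


def binomial_cdf(x, K, eps):
    """Phi(x; K, eps): probability of at most x successes in K Bernoulli trials."""
    return sum(comb(K, j) * eps**j * (1.0 - eps)**(K - j)
               for j in range(min(x, K) + 1))


def discarding_bound(R, rank, K, eps):
    """Right-hand side of (6.5) for a stage with support rank `rank`."""
    return comb(R + rank - 1, R) * binomial_cdf(R + rank - 1, K, eps)


def min_sample_size(R, rank, eps, theta):
    """Smallest K with discarding_bound(R, rank, K, eps) <= theta (bisection)."""
    lo = R + rank - 1            # bound is at least 1 here, hence above theta
    hi = max(2 * lo, 1)
    while discarding_bound(R, rank, hi, eps) > theta:
        lo, hi = hi, 2 * hi      # grow the bracket until the bound is met
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if discarding_bound(R, rank, mid, eps) <= theta:
            hi = mid
        else:
            lo = mid
    return hi


# Support rank 2, 10% violation level, confidence 1e-6 split over two stages.
K_without_discarding = min_sample_size(R=0, rank=2, eps=0.10, theta=5e-7)
K_with_discarding = min_sample_size(R=10, rank=2, eps=0.10, theta=5e-7)
print(K_without_discarding, K_with_discarding)
```

Allowing $R_i = 10$ discarded constraints requires a larger sample than the non-removal strategy $R_i = 0$, which is the trade-off described at the beginning of this section.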



6.3 Explicit Bounds on the Sample-and-Discarding Pairs 

Similar to multi-stage RPs without ex-post constraint removal in Section 5, explicit bounds on the sample
size $K_i$ can also be derived for the sampling-and-discarding approach, assuming the number of discarded
constraints $R_i$ to be fixed. The technicalities, using Chernoff bounds [14], have already been worked out
for single-stage RPs; refer to [10, Sec. 5] for details. The resulting explicit bound becomes

$$ K_i \ge \frac{2}{\varepsilon_i} \log\Bigl(\frac{1}{\theta_i}\Bigr) + \frac{4}{\varepsilon_i} (R_i + \rho_i - 1) , \tag{6.6} $$

where $\log(\cdot)$ denotes the natural logarithm. Similar techniques can be applied for obtaining an explicit
bound on the number of discarded constraints $R_i$, assuming the sample size $K_i$ to be fixed. The mathematical
details and a further discussion are found in [13, Sec. 4.3], and the resulting bound is given by

$$ R_i \le \varepsilon_i K_i - \rho_i + 1 - \sqrt{2 \varepsilon_i K_i \log\Bigl(\frac{(\varepsilon_i K_i)^{\rho_i - 1}}{\theta_i}\Bigr)} . \tag{6.7} $$
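
The explicit bounds can be evaluated directly, as in the sketch below. It assumes that (6.6) carries the leading factors $2/\varepsilon_i$ and $4/\varepsilon_i$ of the single-stage result in [10, Sec. 5] and that (6.7) has the form of the bound in [13, Sec. 4.3]; both should be checked against these references before use. All function names are hypothetical. The final lines check numerically that the explicit choices satisfy the implicit condition that the right-hand side of (6.5) does not exceed $\theta_i$.

```python
from math import ceil, comb, floor, log, sqrt


def discarding_bound(R, rank, K, eps):
    """Right-hand side of (6.5): binomial coefficient times binomial CDF."""
    x = R + rank - 1
    cdf = sum(comb(K, j) * eps**j * (1.0 - eps)**(K - j)
              for j in range(min(x, K) + 1))
    return comb(x, R) * cdf


def explicit_sample_size(R, rank, eps, theta):
    """Explicit bound (6.6) on K for a fixed number R of discarded constraints.
    The constants 2 and 4 are an assumption based on [10, Sec. 5]."""
    return ceil(2.0 / eps * log(1.0 / theta) + 4.0 / eps * (R + rank - 1))


def explicit_discards(K, rank, eps, theta):
    """Explicit bound (6.7) on R for a fixed sample size K (never negative)."""
    value = eps * K - rank + 1 - sqrt(
        2.0 * eps * K * log((eps * K) ** (rank - 1) / theta))
    return max(floor(value), 0)


K_explicit = explicit_sample_size(R=10, rank=2, eps=0.10, theta=1e-6)
R_explicit = explicit_discards(K=2000, rank=2, eps=0.10, theta=1e-6)
print(K_explicit, discarding_bound(10, 2, K_explicit, 0.10))
print(R_explicit, discarding_bound(R_explicit, 2, 2000, 0.10))
```

Both explicit choices are conservative compared to the implicit formula, which is the price for their closed form.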



7 Example: Minimal Diameter Cuboid 

In this final section the application of multi-stage RPs is demonstrated for an academic example, which 
has been selected to emphasize the potential of randomization in stochastic optimization in general and the 
advantage of multi-stage over single-stage RPs in particular problems. 



7.1 Problem Statement 

Let $\delta$ be an uncertain point in $\Delta \subset \mathbb{R}^n$, where both its distribution and support set may be unknown. The
objective of this example is to construct the Cartesian product of closed intervals in $\mathbb{R}^n$ ('$n$-cuboid') $C$ of
minimal $n$-diameter $W$, while being large enough to contain the uncertain point $\delta$ in its $i$-th coordinate
with probability $1 - \varepsilon_i$; see Figure 7.1 for an illustration. The task resembles the problem of finding the
smallest ball containing an uncertain point [15, Sec. 2], except that here the coordinates are constrained
independently.

Denote by $z \in \mathbb{R}^n$ the center point of the cuboid and by $w \in \mathbb{R}^n$ the interval widths in each dimension,
so that

$$ C = \bigl\{ \xi \in \mathbb{R}^n \;\big|\; |\xi_i - z_i| \le w_i / 2 \;\; \forall\, i \in \mathbb{N}_1^n \bigr\} . \tag{7.1} $$

Then the corresponding stochastic program reads as follows,

$$ \min_{z \in \mathbb{R}^n,\, w \in \mathbb{R}^n} \; \|w\|_2 , \tag{7.2a} $$
$$ \text{s.t.} \quad \mathrm{Pr}\bigl[ z_i - w_i/2 \le \delta_i \le z_i + w_i/2 \bigr] \ge (1 - \varepsilon_i) \qquad \forall\, i \in \mathbb{N}_1^n . \tag{7.2b} $$

Since the objective function is not linear, problem (7.2) first has to be rewritten by an epigraph reformulation
(compare Remark 2.2(a)) into

$$ \min_{z \in \mathbb{R}^n,\, w \in \mathbb{R}^n,\, W \in \mathbb{R}} \; W , \tag{7.3a} $$
$$ \text{s.t.} \quad \|w\|_2 \le W , \tag{7.3b} $$
$$ \mathrm{Pr}\bigl[ \max\{ z_i - w_i/2 - \delta_i ,\; -z_i - w_i/2 + \delta_i \} \le 0 \bigr] \ge (1 - \varepsilon_i) \qquad \forall\, i \in \mathbb{N}_1^n . \tag{7.3c} $$

Figure 7.1: Illustration of the numerical example for $\mathbb{R}^2$. The point $\delta \in \Delta$ appears at random in $\mathbb{R}^2$,
according to some unknown distribution; the points drawn here represent 166 i.i.d. samples of $\delta$. The
objective is to construct the smallest product of two closed intervals ('2-cuboid'), indicated here by the
shaded rectangle, such that the probability of missing $\delta$ is smaller than $\varepsilon_1, \varepsilon_2$ in the respective dimension
$i = 1, 2$.

Note that (7.3) has the form of an MSP (2.1), for a $d = 2n + 1$ dimensional search space and $N = n$
stages. In particular, the objective function (7.3a) is linear, as it is a scalar; constraint (7.3b) is deterministic
and convex; and each of the $n$ stages (7.3c) contains a nominal chance constraint that is convex in $z, w$ for
any fixed value of the uncertainty $\delta \in \Delta$; see [8, Ch. 3]. Each of the stages $i = 1, \dots, n$ depends on exactly
one component $\delta_i$ of the uncertainty, which is a special case of dependence on the entire uncertainty vector
$\delta$ (compare Remark 2.2(c)). The convex and compact set $\mathbb{X}$ is constructed from the positivity constraints on
$w$, the deterministic and convex constraint (7.3b), and some very high artificial bounds introduced for all
variables. Existence of a feasible solution, and hence the satisfaction of the corresponding existence
assumption, is intuitively clear from the problem setup.

7.2 Randomization Approach

To solve (7.3) by the randomization approach, observe that each of the stages $i = 1, \dots, n$ has support
rank $\rho_i = 2$ in the MRP corresponding to (7.3), as it only depends on the two variables $z_i$ and $w_i$. For a
fixed confidence level, e.g. $\theta = 10^{-6}$, the implicit sample sizes $K_1, \dots, K_n$ in (5.1) can be computed for
a given problem dimension $n$ and chance constraint levels $\varepsilon_1, \dots, \varepsilon_n$ by a bisection-based algorithm
(see Section 5.2). For simplicity, the chance constraint levels are selected as equal for all stages, so that the
implicit sample sizes are all identical.

Given the outcomes of all random multi-samples $\omega^{(1)}, \dots, \omega^{(n)}$, the randomized program is easily solved
by the smallest $n$-cuboid that contains all sampled points; see also Figure 7.1. In other words, the solution of
the resulting MRP can be determined analytically and without the use of any iterative optimization scheme.
Table 7.1(a) summarizes the minimal sample sizes required for guaranteeing various chance constraint levels
$\varepsilon_i$ in various dimensions $n$ (with confidence $\theta = 10^{-6}$). The indicated numbers represent the (identical)
implicit sample size for all stages $i = 1, \dots, n$, rather than the ones obtained from the explicit bounds of
Section 5.
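
The analytic solution of the randomized program can be sketched in a few lines: the optimal cuboid is the coordinate-wise bounding box of the sampled points. The function name `smallest_cuboid` and the three demonstration points are hypothetical and only serve as an illustration.

```python
from math import sqrt


def smallest_cuboid(points):
    """Return (z, w, W) for the smallest axis-aligned cuboid containing
    all `points`: center z, interval widths w, and diameter W = ||w||_2."""
    n = len(points[0])
    lows = [min(p[i] for p in points) for i in range(n)]
    highs = [max(p[i] for p in points) for i in range(n)]
    z = [(lo + hi) / 2.0 for lo, hi in zip(lows, highs)]
    w = [hi - lo for lo, hi in zip(lows, highs)]
    W = sqrt(sum(wi * wi for wi in w))
    return z, w, W


# Three sampled points in the plane (n = 2), chosen by hand for readability.
demo_points = [(0.0, 0.0), (2.0, 1.0), (-1.0, 3.0)]
z, w, W = smallest_cuboid(demo_points)
print("center:", z, "widths:", w, "diameter:", W)
```

Each coordinate is handled independently, which mirrors the fact that stage $i$ only involves the two variables $z_i$ and $w_i$.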

One can compare these sample sizes to those resulting from the single-stage RP theory (Table 7.1), and
also the corresponding solution quality (Table 7.2). In order to apply the single-stage RP theory, first all of
the (nominal) constraints in (7.3b,c) are combined into a single stage by (a) taking its constraint function
as the point-wise maximum of the constraint functions $i = 1, \dots, n$ in (7.3c), and (b) taking the constraint
level as $\min\{\varepsilon_1, \dots, \varepsilon_n\}$. It is easy to realize the conservatism introduced by this procedure (as described in
more detail in Section 1).



                        cuboid dimension n
sample size $K_i$               2        3        5       10       50      100      500
$\varepsilon_i$ = 1%        1,734    1,777    1,831    1,903    2,072    2,144    2,311
$\varepsilon_i$ = 5%          341      349      360      374      407      421      454
$\varepsilon_i$ = 10%         166      170      176      182      199      205      221
$\varepsilon_i$ = 25%          62       63       65       67       73       76       82

(a) Multi-Stage Randomized Program.

                        cuboid dimension n
sample size $K_i$               2        3        5       10       50      100      500
$\varepsilon_i$ = 1%        2,334    2,722    3,431    5,020   15,588   27,535  115,786
$\varepsilon_i$ = 5%          459      536      677      992    3,095    5,477   23,093
$\varepsilon_i$ = 10%         225      263      332      488    1,533    2,719   11,506
$\varepsilon_i$ = 25%          84       99      125      186      595    1,063    4,550

(b) Single-Stage Randomized Program.

Table 7.1: Implicit sample sizes $K_1 = \dots = K_n$ for the multi-stage and the single-stage RP, based on a
confidence level of $\theta = 10^{-6}$, depending on problem dimension $n$ and chance constraint levels
$\varepsilon_1 = \dots = \varepsilon_n$.

As a result, the sample sizes based on the single-stage approach are significantly larger than those of
the multi-stage approach, as can be seen from Table 7.1(b). In particular, it is not surprising that the
multi-stage approach scales a lot better with growing dimension $n$ of the problem, the reason being that
the support dimension of each stage does not grow with $n$, in contrast to Helly's dimension. The remaining
small growth of the sample size with the dimension in Table 7.1(a) is owed to the fact that in the multi-stage
approach the confidence level $\theta$ needs to be (evenly) distributed among the stages, i.e. $\theta_i = \theta / n$ for all
$i = 1, \dots, n$, in order to allow for a fair comparison to the single-stage approach using $\theta$.
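
The mild growth of the multi-stage sample size with $n$ can be reproduced numerically. The sketch below assumes that the implicit condition of Section 5 takes the form $\Phi(\rho_i - 1; K_i, \varepsilon_i) \le \theta_i$, i.e. the bound (6.5) with $R_i = 0$; the function names are hypothetical. With $\rho_i = 2$, $\varepsilon_i = 10\%$ and $\theta_i = 10^{-6}/n$, it is meant to recover the entries for $n = 2$ and $n = 500$ in Table 7.1(a).

```python
from math import comb


def binomial_cdf(x, K, eps):
    """Phi(x; K, eps): probability of at most x successes in K Bernoulli trials."""
    return sum(comb(K, j) * eps**j * (1.0 - eps)**(K - j)
               for j in range(min(x, K) + 1))


def implicit_sample_size(rank, eps, theta_i):
    """Smallest K with Phi(rank - 1; K, eps) <= theta_i, by direct search."""
    K = rank
    while binomial_cdf(rank - 1, K, eps) > theta_i:
        K += 1
    return K


theta = 1e-6
# The support rank stays at 2; only the confidence share theta / n shrinks with n.
sizes = {n: implicit_sample_size(rank=2, eps=0.10, theta_i=theta / n)
         for n in (2, 500)}
print(sizes)
```

Because $n$ enters only through $\theta_i = \theta / n$, the required sample size grows roughly logarithmically with the dimension.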

A larger sample size not only increases the data requirements and the computational effort, but also diminishes
the quality of the randomized solution. The latter effect is quantified in Table 7.2, where the objective function
value achieved by the single-stage RP is compared to that of the multi-stage RP (for the sample sizes in
Table 7.1). The indicated values represent the averages over one million RPs, using a multi-variate standard
normal distribution for the uncertainty $\delta$.

                        cuboid dimension n
relative obj. value             2        3        5       10       50      100      500
$\varepsilon_i$ = 1%         2.4%     3.4%     5.0%     7.5%    14.8%    18.4%    26.9%
$\varepsilon_i$ = 5%         3.3%     4.6%     6.6%     9.8%    18.9%    23.8%    34.4%
$\varepsilon_i$ = 10%        3.9%     5.4%     7.6%    11.5%    22.2%    27.4%    39.3%
$\varepsilon_i$ = 25%        5.0%     7.2%    10.1%    15.1%    28.5%    34.7%    49.1%

Table 7.2: Objective function value of the single-stage case as relative surplus over the multi-stage case, based
on the sample sizes in Table 7.1 and a multi-variate standard normal distribution for $\delta$. Each of the indicated
values represents an average over one million randomized solutions.
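
A small Monte Carlo experiment in the spirit of Table 7.2 can be sketched as follows. Several choices here are assumptions of the sketch rather than details taken from the paper: the surplus is computed as a ratio of averaged objective values, the multi-stage sample is taken as the first $K_i$ points of the larger single-stage sample, and only a few hundred trials are used instead of one million. The function names are hypothetical.

```python
import random
from math import sqrt


def cuboid_diameter(points):
    """Diameter ||w||_2 of the smallest axis-aligned cuboid containing `points`."""
    n = len(points[0])
    widths = [max(p[i] for p in points) - min(p[i] for p in points)
              for i in range(n)]
    return sqrt(sum(wi * wi for wi in widths))


def relative_surplus(n, k_multi, k_single, trials, seed=0):
    """Estimate (mean single-stage objective) / (mean multi-stage objective) - 1
    for standard normal uncertainty in n dimensions."""
    rng = random.Random(seed)
    total_multi = 0.0
    total_single = 0.0
    for _ in range(trials):
        points = [[rng.gauss(0.0, 1.0) for _ in range(n)]
                  for _ in range(k_single)]
        total_single += cuboid_diameter(points)
        # Nested subsample: the first k_multi points act as the multi-stage sample.
        total_multi += cuboid_diameter(points[:k_multi])
    return total_single / total_multi - 1.0


# Sample sizes for n = 2 and a 10% chance constraint level, read from Table 7.1.
surplus = relative_surplus(n=2, k_multi=166, k_single=225, trials=300)
print(f"relative surplus: {100.0 * surplus:.1f}%")
```

With so few trials the estimate is noisy, so it should only be expected to land in the vicinity of the corresponding entry of Table 7.2.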
A Probability Distributions

Several basic probability-related functions are used throughout this paper. The Binomial Distribution Function
[1, p. 26.1.20]

$$ \Phi(x; K, \varepsilon) := \sum_{j=0}^{x} \binom{K}{j} \varepsilon^{j} (1 - \varepsilon)^{K - j} \tag{A.1} $$

expresses the probability of seeing at most $x \in \mathbb{N}_0$ successes in $K \in \mathbb{N}$ independent Bernoulli trials, where
the probability of success is $\varepsilon \in (0, 1)$ per trial. The (real) Beta Function [1, p. 6.2.1]

$$ \mathrm{B}(a, b) := \int_0^1 \xi^{a-1} (1 - \xi)^{b-1} \, d\xi \tag{A.2} $$

is defined for any parameters $a, b \in \mathbb{R}_+$, where $\xi \in (0, 1)$ is the variable of integration; it also satisfies the
identity [1, p. 6.2.2]

$$ \mathrm{B}(a, b) = \mathrm{B}(b, a) = \frac{\Gamma(a)\, \Gamma(b)}{\Gamma(a + b)} , \tag{A.3} $$

where $\Gamma : \mathbb{R}_+ \to \mathbb{R}_+$ denotes the (real) Gamma Function with $\Gamma(n + 1) = n!$ for any $n \in \mathbb{N}_0$ [1, p. 6.1.5].
The corresponding Incomplete Beta Function [1, p. 6.6.1] is then given by

$$ \mathrm{B}(\varepsilon; a, b) := \int_0^{\varepsilon} \xi^{a-1} (1 - \xi)^{b-1} \, d\xi = \int_{1-\varepsilon}^{1} \xi^{b-1} (1 - \xi)^{a-1} \, d\xi , \tag{A.4} $$

where the last equality follows by a simple substitution. An important identity is obtained from
[1, pp. 3.1.1, 6.6.2, 26.5.7],

$$ \mathrm{B}(\varepsilon; a, b) = \mathrm{B}(a, b) \sum_{j=a}^{a+b-1} \binom{a + b - 1}{j} \varepsilon^{j} (1 - \varepsilon)^{a + b - 1 - j} , \tag{A.5} $$

which can be written more compactly by use of the binomial distribution (A.1), see for instance [10, p. 3437]:

$$ \mathrm{B}(\varepsilon; a, b) = \frac{1}{b} \binom{a + b - 1}{b}^{-1} \Phi(b - 1;\, a + b - 1,\, 1 - \varepsilon) . \tag{A.6} $$
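
The identity (A.6) can be checked numerically for integer parameters, as in the following sketch. It compares a Simpson-rule evaluation of the integral in (A.4) with the closed form based on the binomial distribution function; the function names and the chosen test values are hypothetical.

```python
from math import comb


def binomial_cdf(x, K, eps):
    """Phi(x; K, eps) from (A.1)."""
    return sum(comb(K, j) * eps**j * (1.0 - eps)**(K - j)
               for j in range(min(x, K) + 1))


def incomplete_beta_numeric(eps, a, b, steps=2000):
    """Incomplete Beta Function (A.4) by composite Simpson integration.
    `steps` must be even."""
    h = eps / steps

    def integrand(xi):
        return xi ** (a - 1) * (1.0 - xi) ** (b - 1)

    total = integrand(0.0) + integrand(eps)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3.0


def incomplete_beta_closed(eps, a, b):
    """Right-hand side of (A.6), valid for positive integers a and b."""
    return binomial_cdf(b - 1, a + b - 1, 1.0 - eps) / (b * comb(a + b - 1, b))


numeric = incomplete_beta_numeric(0.3, a=3, b=2)
closed = incomplete_beta_closed(0.3, a=3, b=2)
print(numeric, closed)
```

For $a = 3$, $b = 2$ and $\varepsilon = 0.3$ the integral can also be evaluated by hand as $0.3^3/3 - 0.3^4/4 = 0.006975$, which both functions should match up to rounding.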
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